Preface


Both volumes in this series are about what mathematicians, especially logicians, 
call the “foundations” (of mathematics) — that is, the tools of the axiomatic 
method, an assessment of their effectiveness, and two major examples of ap- 
plication of these tools, namely, in the development of number theory and set 
theory. 

There have been, in hindsight, two main reasons for writing this volume. 
One was the existence of notes I wrote for my lectures in mathematical logic 
and computability that had been accumulating over the span of several years 
and badly needed sorting out. The other was the need to write a small section 
on logic, “A Bit of Logic” as I originally called it, that would bootstrap my 
volume on set theory† on which I had been labouring for a while. Well, one
thing led to another, and a 30 or so page section that I initially wrote for the 
latter purpose grew to become a self-standing volume of some 300 pages. You 
see, this material on logic is a good story and, as with all good stories, one does 
get carried away wanting to tell more. 

I decided to include what many people will consider, I should hope, as 
being the absolutely essential topics in proof, model, and recursion theory — 
“absolutely essential” in the context of courses taught near the upper end of 
undergraduate, and at the lower end of graduate curricula in mathematics, computer
science, or philosophy. But no more.‡ This is the substance of Chapter I;
hence its title “Basic Logic”. 


† A chapter by that name now carries out these bootstrapping duties: the proverbial “Chapter 0”
(actually Chapter I) of volume 2.

‡ These topics include the foundation and development of non-standard analysis up to the extreme
value theorem, elementary equivalence, diagrams, and Löwenheim-Skolem theorems, and
Gödel’s first incompleteness theorem (along with Rosser’s sharpening).
But then it occurred to me to also say something about one of the most
remarkable theorems of logic (arguably the most remarkable) about the limitations
of formalized theories: Gödel’s second incompleteness theorem. Now,
like most reasonable people, I never doubted that this theorem is true, but, as the
devil is in the details, I decided to learn its proof, right from Peano’s axioms.
What better way to do this than writing down the proof, gory details and all?
This is what Chapter II is about.†

As a side effect, the chapter includes many theorems and techniques of one 
of the two most important — from the point of view of foundations — “applied” 
logics (formalized theories), namely, Peano arithmetic (the other one, set theory, 
taking all of volume 2). 

I have hinted above that this (and the second) volume are aimed at a fairly
advanced reader: The level of exposition is designed to fit a spectrum of mathematical
sophistication from third year undergraduate to junior graduate level
(each group will find here its favourite sections that serve its interests and level 
of preparation — and should not hesitate to judiciously omit topics). 

There are no specific prerequisites beyond some immersion in the “proof
culture”, as this is attainable through junior level courses in calculus, linear algebra,
or discrete mathematics. However, some familiarity with concepts from
elementary naive set theory such as finiteness, infinity, countability, and uncountability
will be an asset.‡


A word on approach. I have tried to make these lectures user-friendly, and thus 
accessible to readers who do not have the benefit of an instructor’s guidance. 
Devices to that end include anticipation of questions, frequent promptings for 
the reader to rethink an issue that might be misunderstood if glossed over 
(“Pauses”), and the marking of important passages, as well as those that
can be skipped at first reading, by special signs.

Moreover, I give (mostly) very detailed proofs, as I know from experience 
that omitting details normally annoys students. 


† It is strongly conjectured here that this is the only complete proof in print other than the one
that was given in Hilbert and Bernays (1968). It is fair to clarify that I use the term “complete
proof” with a strong assumption in mind: That the axiom system we start with is just Peano
arithmetic. Proofs based on a stronger (thus technically more convenient) system, namely,
primitive recursive arithmetic, have already appeared in print (Diller (1976), Smoryński (1985)).
The difficulty with using Peano arithmetic as the starting point is that the only primitive recursive
functions initially available are the successor, identity, plus, and times. An awful amount of work
is needed (a preliminary “coding trick”) to prove that all the rest of the primitive recursive
functions “exist”. By then we are already midway in Chapter II, and only then are we ready to
build Gödel numbers of terms, formulas, and proofs and to prove the theorem.

‡ I have included a short paragraph nicknamed “a crash course on countable sets” (Section I.5,
p. 62), which certainly helps. But having seen these topics before helps even more.
The first chapter has a lot of exercises (the second having proportionally
fewer). Many of these have hints, but none are marked as “hard” vs. “just about
right”, a subjective distinction I prefer to avoid. In this connection here is some
good advice I received when I was a graduate student at the University of 
Toronto: “Attempt all the problems. Those you can do, don’t do. Do the ones 
you cannot”. 


What to read. Consistently with the advice above, I suggest that you read this 
volume from cover to cover — including footnotes! — skipping only what you 
already know. Now, in a class environment this advice may be impossible to 
take, due to scope and time constraints. An undergraduate (one semester) course 
in logic at the third year level will probably cover Sections I.1–I.5, making light
of Section I.2, and will introduce the student to the elements of computability
along with a hand-waving “proof” of Gödel’s first incompleteness theorem (the
“semantic version” ought to suffice). A fourth year class will probably attempt
to cover the entire Chapter I. A first year graduate class has no more time than
the others at its disposal, but it usually goes much faster, skipping over familiar
ground, thus it will probably additionally cover Peano arithmetic and will get
to see how Gödel’s second theorem follows from Löb’s derivability conditions.


Acknowledgments. I wish to offer my gratitude to all those who taught me, 
a group led by my parents and too large to enumerate. I certainly include my 
students here. I also include Raymond Wilder’s book on the foundations of 
mathematics, which introduced me, long long ago, to this very exciting field 
and whetted my appetite for more (Wilder (1963)). 

I should like to thank the staff at Cambridge University Press for their pro- 
fessionalism, support, and cooperation, with special appreciation due to Lauren 
Cowles and Caitlin Doggart, who made all the steps of this process, from ref- 
ereeing to production, totally painless. 

This volume is the last installment of a long project that would have not been 
successful without the support and warmth of an understanding family (thank 
you). 

I finally wish to record my appreciation to Donald Knuth and Leslie Lamport
for the typesetting tools TeX and LaTeX that they have made available to the technical
writing community, making the writing of books such as this one almost
easy.


George Tourlakis 
Toronto, March 2002 
Basic Logic


Logic is the science of reasoning. Mathematical logic applies to mathematical 
reasoning — the art and science of writing down deductions. This volume is 
about the form, meaning, use, and limitations of logical deductions, also called 
proofs. While the user of mathematical logic will practise the various proof 
techniques with a view of applying them in everyday mathematical practice, 
the student of the subject will also want to know about the power and limitations 
of the deductive apparatus. We will find that there are some inherent limitations 
in the quest to discover truth by purely formal — that is, syntactic — techniques. 
In the process we will also discover a close affinity between formal proofs and 
computations that persists all the way up to and including issues of limitations: 
Not only is there a remarkable similarity between the types of respective limitations
(computations vs. uncomputable functions, and proofs vs. unprovable,
but “true”, sentences), but, in a way, you cannot have one type of limitation
without having the other.

The modern use of the term mathematical logic encompasses (at least) the 
areas of proof theory (it studies the structure, properties, and limitations of 
proofs), model theory (it studies the interplay between syntax and meaning — or 
semantics — by looking at the algebraic structures where formal languages are 
interpreted), recursion theory (or computability, which studies the properties 
and limitations of algorithmic processes), and set theory. The fact that the last- 
mentioned will totally occupy our attention in volume 2 is reflected in the 
prominence of the term in the title of these lectures. It also reflects a tendency, 
even today, to think of set theory as a branch in its own right, rather than as an 
“area” under a wider umbrella. 
Volume 1 is a brief study of the other three areas of logic† mentioned above.
This is the point where an author usually apologizes for what has been omitted, 
blaming space or scope (or competence) limitations. Let me start by outlin- 
ing what is included: “Standard” phenomena such as completeness, compact- 
ness and its startling application to analysis, incompleteness or unprovabil- 
ity (including a complete proof of the second incompleteness theorem), and a 
fair amount of recursion theory are thoroughly discussed. Recursion theory, 
or computability, is of interest to a wide range of audiences, including stu- 
dents with main areas of study such as computer science, philosophy, and, of 
course, mathematical logic. It studies among other things the phenomenon of 
uncomputability, which is closely related to that of unprovability, as we see in 
Section I.9.

Among the topics that I have deliberately left out are certain algebraic techniques
in model theory (such as the method of ultrapowers), formal interpretations
of one theory into another,‡ the introduction of “other” logics (modal,
higher order, intuitionistic, etc.), and several topics in recursion theory (oracle 
computability, Turing reducibility, recursive operators, degrees, Post’s theorem 
in the arithmetic hierarchy, the analytic hierarchy, etc.) — but then, the decision 
to stop writing within 300 or so pages was firm. On the other hand, the topics 
included here form a synergistic whole in that I have (largely) included at every 
stage material that is prerequisite to what follows. The absence of a section on 
propositional calculus is deliberate, as it does not in my opinion further the 
understanding of logic in any substantial way, while it delays one’s plunging 
into what really matters. To compensate, I include all tautologies as “propositional”
(or Boolean) logical axioms and present a mini-course on propositional
calculus in the exercises of this chapter (I.26–I.41, pp. 193–195), including the
completeness and compactness of the calculus.

It is inevitable that the language of sets intrudes in this chapter (as it indeed 
does in all mathematics) and, more importantly, some of the results of (informal) 
set theory are needed here (especially in our proofs of the completeness and 
compactness metatheorems). Conversely, formal set theory of volume 2 needs 
some of the results developed here. This “chicken or egg” phenomenon is often 
called “bootstrapping” (not to be confused with “circularity”, which it is not§),
the term suggesting one pulling oneself up by one’s bootstraps.¶


† I trust that the reader will not object to my dropping the qualifier “mathematical” from now on.

‡ Although this topic is included in volume 2 (Chapter I), since it is employed in the relative
consistency techniques applied there.

§ Only informal, or naive, set theory notation and results are needed in Chapter I at the meta-level,
i.e., outside the formal system that logic is.

¶ I am told that Baron Münchhausen was the first one to apply this technique, with success.
This is a good place to outline how our story will unfold: First, our objective is to
formalize the rules of reasoning in general — as these apply to all mathematics — 
and develop their properties. In particular, we will study the interaction between 
formalized rules and their “intended meaning” (semantics), as well as the limi- 
tations of these formalized rules: That is, how good (= potent) are they for 
capturing the informal notions of truth? 

Secondly, once we have acquired these tools of formalized reasoning, we start 
behaving (mostly†) as users of formal logic so that we can discover important
theorems of two important mathematical theories: Peano arithmetic (Chapter II)
and set theory (volume 2). 

By formalization (of logic) we understand the faithful representation or 
simulation of the “reasoning processes” of mathematics in general (pure logic), 
or of a particular mathematical theory (applied logic: e.g., Peano arithmetic), 
within an activity that — in principle — is driven exclusively by the form or syntax 
of mathematical statements, totally ignoring their meaning. 

We build, describe, and study the properties of this artificial replica of the 
reasoning processes — the formal theory — within “everyday mathematics” (also 
called “informal” or “real” mathematics), using the usual abundance of mathe- 
matical symbolism, notions, and techniques available to us, augmented by the 
descriptive power of English (or Greek, or French, or German, or Russian, 
or ..., as particular circumstances or geography might dictate). This milieu
within which we build, pursue, and study our theories is often called the meta- 
theory, or more generally, metamathematics. The language we speak while at 
it, this mélange of mathematics and “natural language”, is the metalanguage. 

Formalization turns mathematical theories into mathematical objects that 
we can study. For example, such study may include interesting questions such 
as “is the continuum hypothesis provable from the axioms of set theory?” or 
“can we prove the consistency of (axiomatic) Peano arithmetic within Peano 
arithmetic?”‡ This is analogous to building a “model airplane”, a replica of the
real thing, with a view of studying through the replica the properties, power, 
and limitations of the real thing. 

But one can also use the formal theory to generate theorems, i.e., discover 
“truths” in the real domain by simply “running” the simulation that this theory- 
replica is.§ Running the simulation “by hand” (rather than using the program


† Some tasks in Chapter II of this volume, and some others in volume 2, will be to treat the “theory”
at hand as an object of study rather than using it, as a machine, to crank out theorems.

‡ By the way, the answer to both these questions is “no” (Cohen (1963) for the first, Gödel (1938)
for the second).

§ The analogy implied in the terminology “running the simulation” is apt. For formal theories such
as set theory and Peano arithmetic we can build within real mathematics a so-called “provability
of the previous footnote) means that you are acting as a “user” of the formal
system, a formalist, proving theorems through it. It turns out that once you get 
the hang of it, it is easier and safer to reason formally than to do so informally. 
The latter mode often mixes syntax and semantics (meaning), and there is 
always the danger that the “user” may assign incorrect (i.e., convenient, but not 
general) meanings to the symbols that he† manipulates, a phenomenon that has
distressed many a mathematics or computer science instructor. 

“Formalism for the user” is hardly a revolutionary slogan. It was advocated 
by Hilbert, the founder of formalism, partly as a means of (as he believed‡)
formulating mathematical theories in a manner that allows one to check them
(i.e., run “diagnostic tests” on them) for freedom from contradiction,§ but also
as the right way to “do” mathematics. By this proposal he hoped to salvage 
mathematics itself, which, Hilbert felt, was about to be destroyed by the Brouwer 
school of intuitionist thought. In a way, his program could bridge the gap 
between the classical and the intuitionist camps, and there is some evidence 
that Heyting (an influential intuitionist and contemporary of Hilbert) thought 
that such a rapprochement was possible. After all, since meaning is irrelevant to 
a formalist, then all that he is doing (in a proof) is shuffling finite sequences of 
symbols, never having to handle or argue about infinite objects — a good thing, 
as far as an intuitionist is concerned.¶


predicate”, that is, a relation P(y, x) which is true of two natural numbers y and x just in case y
codes a proof of the formula coded by x. It turns out that P(y, x) has so simple a structure that it 
is programmable, say in the C programming language. But then we can write a program (also in 
C) as follows: “Systematically generate all the pairs of numbers (y, x). For each pair generated, 
if P(y, x) holds, then print the formula coded by x”. Letting this process run for ever, we obtain 
a listing of all the theorems of Peano arithmetic or set theory! This fact does not induce any 
insomnia in mathematicians, since this is an extremely impractical way to obtain theorems. By 
the way, we will see in Chapter II that either set theory or Peano arithmetic is sufficiently strong 
to formally express a provability predicate, and this leads to the incompletableness phenomenon. 
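
The enumeration described in this footnote can be sketched in a few lines. The sketch is in Python rather than C, and toy_P is a made-up stand-in for the provability predicate (the real P(y, x) needs the Gödel numbering that is only built in Chapter II). Only the systematic generation of pairs and the filtering loop illustrate the idea; the sketch also yields the code x itself instead of printing the formula coded by x.

```python
from itertools import count, islice


def pairs():
    """Systematically generate all pairs (y, x) of natural numbers."""
    for total in count(0):          # total = y + x, one diagonal at a time
        for y in range(total + 1):
            yield y, total - y


def toy_P(y, x):
    """Hypothetical stand-in: pretend y codes a proof of the formula coded by x."""
    return y == 2 * x


def theorems(P):
    """Yield the code x of every formula for which some y satisfies P(y, x)."""
    for y, x in pairs():
        if P(y, x):
            yield x


# The process never stops by itself, so only a finite prefix is inspected here.
print(list(islice(pairs(), 6)))
print(list(islice(theorems(toy_P), 3)))
```

Every pair is reached after finitely many steps, which is all that "letting this process run for ever" needs in order to list each theorem eventually.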

† In this volume, the terms “he”, “his”, “him”, and their derivatives are by definition gender-neutral.

‡ This belief was unfounded, as Gödel’s incompleteness theorems showed.

§ Hilbert’s metatheory (that is, the “world” or “lab” outside the theory, where the replica is
actually manufactured) was finitary. Thus, Hilbert advocated, all this theory building and
theory checking ought to be effected by finitary means. This ingredient of his “program” was
consistent with peaceful coexistence with the intuitionists. And, alas, this ingredient was the one
that, as some writers put it, destroyed Hilbert’s program to found mathematics on his version
of formalism. Gödel’s incompleteness theorems showed that a finitary metatheory is not up to
the task.

¶ True, a formalist applies classical logic, while an intuitionist applies a different logic where, for
example, double negation is not removable. Yet, unlike a Platonist, a Hilbert-style formalist does
not believe (or he does not have to disclose to his intuitionist friends that he might believe) that
infinite sets exist in the metatheory, as his tools are just finite symbol sequences. To appreciate the
tension here, consider this anecdote: It is said that when Kronecker (the father of intuitionism)
was informed of Lindemann’s proof (1882) that π is transcendental, while he granted that this was
an interesting result, he also dismissed it, suggesting that “π”, whose decimal expansion is, of
In support of the “formalism for the user” position we must definitely mention
the premier paradigm, Bourbaki’s monumental work (1966a), which is a
formalization of a huge chunk of mathematics, including set theory, algebra, 
topology, and theory of integration. This work is strictly for the user of mathe- 
matics, not for the metamathematician who studies formal theories. Yet, it is 
fully formalized, true to the spirit of Hilbert, and it comes in a self-contained 
package, including a “Chapter 0” on formal logic. 

More recently, the proposal to employ formal reasoning as a tool has been 
gaining support in a number of computer science undergraduate curricula, where
logic and discrete mathematics are taught in a formalized setting, starting with 
a rigorous course in the two logical calculi (propositional and predicate), em- 
phasizing the point of view of the user of logic (and mathematics) — hence with 
an attendant emphasis on “calculating” (i.e., writing and annotating formal) 
proofs. Pioneering works in this domain are the undergraduate text (1994) and 
the paper (1995) of Gries and Schneider. 


I.1. First Order Languages


In the most abstract (therefore simplest) manner of describing it, a formalized
mathematical theory consists of the following sets of things: A set of basic
or primitive symbols, 𝒱, used to build symbol sequences (also called strings,
or expressions, or words) “over 𝒱”. A set of strings, Wff, over 𝒱, called the
formulas of the theory. Finally, a subset of Wff, called Thm, the set of theorems
of the theory.†

Well, this is the extension of a theory, that is, the explicit set of objects in it. 
How is a theory “given”? 

In most cases of interest to the mathematician it is given by 𝒱 and two
sets of simple rules: formula-building rules and theorem-building rules. Rules
from the first set allow us to build, or generate, Wff from 𝒱. The rules of the
second set generate Thm from Wff. In short (e.g., Bourbaki (1966b)), a theory 
consists of an alphabet of primitive symbols, some rules used to generate the 
“language of the theory” (meaning, essentially, Wff) from these symbols, and 
some additional rules used to generate the theorems. We expand on this below: 


course, infinite but not periodic, “does not exist” (see Wilder (1963, p. 193)). We are not to propound
the tenets of intuitionism here, but it is fair to state that infinite sets are possible in intuitionistic
mathematics as this has later evolved in the hands of Brouwer and his Amsterdam “school”.
However, such sets must be (like all sets of intuitionistic mathematics) finitely generated, just
as our formal languages and the set of theorems are (the latter provided our axioms are too), in
a sense that may be familiar to some readers who have had a course in “automata and language
theory”. See Wilder (1963, p. 234).

† For a less abstract, but more detailed view of theories see p. 38.
I.1.1 Remark. What is a “rule”? We run the danger of becoming circular or too
pedantic if we overdefine this notion. Intuitively, the rules we have in mind are
string manipulation rules, that is, “black boxes” (or functions) that receive string
inputs and respond with string outputs. For example, a well-known theorem-building
rule receives as input a formula and a variable, and returns (essentially)
the string composed of the symbol ∀, immediately followed by the variable and,
in turn, immediately followed by the formula.†
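
To make the “black box” picture concrete, here is a minimal sketch in Python of the rule just mentioned, written as a function from strings to strings. The function name and the sample formula are illustrative assumptions, and the sketch ignores any bracketing conventions (the Remark itself only says “essentially”).

```python
FORALL = "∀"  # the universal quantifier symbol


def generalization(formula: str, variable: str) -> str:
    """Return the symbol ∀, then the variable, then the formula, as one string."""
    return FORALL + variable + formula


# The rule looks only at the form of its inputs, never at their meaning.
print(generalization("(x=x)", "x"))
```

Note that the function would happily accept any two strings; deciding which strings count as formulas and variables is the job of the formula-building rules.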


(1) First off, the (first order)‡ formal language, L, where the theory is “spoken”,
is a triple (𝒱, Term, Wff), that is, it has three important components, each
of them a set.

𝒱 is the alphabet or vocabulary of the language. It is the collection of the
basic syntactic “bricks” (symbols) that we use to form expressions that
are terms (members of Term) or formulas (members of Wff). We will
ensure that the processes that build terms or formulas, using the basic
building blocks in 𝒱, are intuitively algorithmic or “mechanical”.

Terms will formally codify “objects”, while formulas will formally 
codify “statements” about objects. 

(2) Reasoning in the theory will be the process of discovering true statements 
about objects — that is, theorems. This discovery journey begins with certain 
formulas which codify statements that we take for granted (i.e., we accept 
without “proof” as “basic truths”). Such formulas are the axioms. There are 
two types of axioms: 

Special or nonlogical axioms are to describe specific aspects of any
specific theory that we might be building. For example, “x + 1 ≠ 0”
is a special axiom that contributes towards the characterization of
number theory over the natural numbers, N.

The other kind of axiom will be found in all theories. It is the kind that is
“universally valid”, that is, not theory-specific (for example, “x = x”
is such a “universal truth”). For that reason this type of axiom will be
called logical.

(3) Finally, we will need rules for reasoning, actually called rules of inference. 
These are rules that allow us to deduce, or derive, a true statement from 
other statements that we have already established as being true.§ These
rules will be chosen to be oblivious to meaning, being only concerned with 


† This rule is usually called “generalization”.

‡ We will soon say what makes a language “first order”.

§ The generous use of the term “true” here is only meant for motivation. “Provable” or “deducible”
(formula), or “theorem”, will be the technically precise terminology that we will soon define to 
replace the term “true statement”. 
form. They will apply to statement “configurations” of certain recognizable
forms and will produce (derive) new statements of some corresponding 
recognizable forms (See Remark I.1.1). 


I.1.2 Remark. We may think of axioms of either logical or nonlogical type as
special cases of rules, that is, rules that receive no input in order to produce an 
output. In this manner item (2) above is subsumed by item (3), and thus we are 
faithful to our abstract definition of theory where axioms were not mentioned. 

An example, outside mathematics, of an inputless rule is the rule invoked 
when you type date on your computer keyboard. This rule receives no input, 
and outputs on your screen the current date. 
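
In programming terms an inputless rule is simply a function of no arguments. The following Python sketch places the date example next to an axiom viewed the same way; both function names are illustrative assumptions.

```python
from datetime import date


def current_date() -> str:
    """Inputless rule outside mathematics: no input, outputs today's date."""
    return date.today().isoformat()


def logical_axiom() -> str:
    """Inputless rule inside a theory: no premises, outputs a formula."""
    return "x=x"


print(current_date())
print(logical_axiom())
```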


We next look carefully into (first order) formal languages. 

There are two parts in each first order alphabet. The first, the collection of 
the logical symbols, is common to all first order languages regardless of which 
theory is “spoken” in them. We describe this part immediately below. 


Logical Symbols 


LS.1. Object or individual variables. An object variable is any one symbol
out of the non-ending sequence v₀, v₁, v₂, .... In practice (whether
we are using logic as a tool or as an object of study) we agree to be
sloppy with notation and use, generically, x, y, z, u, v, w with or without
subscripts or primes as names of object variables.† This is just a matter
of notational convenience. We allow ourselves to write, say, z instead of,
say, v₁₂₀₀₀₀₀₀₀₀₅₆₀₀₀₀₀₀₉. Object variables (intuitively) “vary over” (i.e.,
are allowed to take values that are) the objects that the theory studies
(numbers, sets, atoms, lines, points, etc., as the case may be).

LS.2. The Boolean or propositional connectives. These are the symbols “¬”
and “∨”.‡ They are pronounced not and or respectively.

LS.3. The existential quantifier, that is, the symbol “∃”, pronounced exists or
for some.

LS.4. Brackets, that is, “(” and “)”.

LS.5. The equality predicate. This is the symbol “=”, which we use to indicate
that objects are “equal”. It is pronounced equals.


† Conventions such as this one are essentially agreements (effected in the metatheory) on how
to be sloppy and get away with it. They are offered in the interest of user-friendliness.

‡ The quotes are not part of the symbol. They serve to indicate clearly here, in particular in the
case of “∨”, what is part of the symbol and what is not (the following period).
The logical symbols will have a fixed interpretation. In particular, “=” will
always be expected to mean equals. 


The theory-specific part of the alphabet is not fixed, but varies from theory 
to theory. For example, in set theory we just add the nonlogical (or special)
symbols, ∈ and U. The first is a special predicate symbol (or just predicate) of
arity 2, the second is a predicate symbol of arity 1.†

In number theory we adopt instead the special symbols S (intended meaning:
successor, or “+ 1” function), +, ×, 0, <, and (sometimes) a symbol for the
exponentiation operation (function) aᵇ. The first three are function symbols of
arities 1, 2, and 2 respectively. 0 is a constant symbol, < a predicate of arity 2,
and whatever symbol we might introduce to denote aᵇ would have arity 2.

The following list gives the general picture. 


Nonlogical Symbols 


NLS.1. A (possibly empty) set of symbols for constants. We normally use
the metasymbols‡ a, b, c, d, e, with or without subscripts or primes, to
stand for constants unless we have in mind some alternative “standard”
formal notation in specific theories (e.g., ∅, 0, ω).

NLS.2. A (possibly empty) set of symbols for predicate symbols or relation
symbols for each possible “arity” n > 0. We normally use P, Q, R
generically, with or without primes or subscripts, to stand for predicate
symbols. Note that = is in the logical camp. Also note that theory-specific
formal symbols are possible for predicates, e.g., <, ∈.

NLS.3. Finally, a (possibly empty) set of symbols for functions for each possible
“arity” n > 0. We normally use f, g, h, generically, with or without
primes or subscripts, to stand for function symbols. Note that theory-specific
formal symbols are possible for functions, e.g., +, ×.
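
As an illustration, the nonlogical symbols of number theory listed earlier can be recorded together with their arities. This is only a sketch in Python: the dictionary layout, the function names, and the choice to report arity 0 for a constant are assumptions made for the example, not part of the formal definition.

```python
# Nonlogical symbols of number theory with the arities stated in the text.
FUNCTION_ARITY = {"S": 1, "+": 2, "×": 2}
PREDICATE_ARITY = {"<": 2}
CONSTANTS = {"0"}


def arity(symbol: str) -> int:
    """Number of arguments that correct syntax demands of a nonlogical symbol."""
    if symbol in CONSTANTS:
        return 0  # assumption of this sketch: a constant takes no arguments
    if symbol in FUNCTION_ARITY:
        return FUNCTION_ARITY[symbol]
    if symbol in PREDICATE_ARITY:
        return PREDICATE_ARITY[symbol]
    raise KeyError(f"{symbol!r} is not a nonlogical symbol of this language")


def correctly_applied(symbol: str, arguments: list) -> bool:
    """Check that a symbol is given exactly as many arguments as its arity."""
    return len(arguments) == arity(symbol)


print(correctly_applied("S", ["0"]))   # S applied to one argument
print(correctly_applied("+", ["0"]))   # + applied to only one argument
```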


I.1.3 Remark. (1) We have the option of assuming that each of the logical
symbols that we named in LS.1–LS.5 has no further “structure” and that the
symbols are, ontologically, identical to their names, that is, they are just these
exact signs drawn on paper (or on any equivalent display medium).

In this case, changing the symbols, say, ¬ and ∃ to ∼ and E respectively
results in a “different” logic, but one that is, trivially, “isomorphic” to the one

† “Arity” is a term mathematicians have made up. It is derived from “ary” of “unary”, “binary”,
etc. It denotes the number of arguments needed by a symbol according to the dictates of correct
syntax. Function and predicate symbols need arguments.

‡ Metasymbols are informal (i.e., outside the formal language) symbols that we use within
“everyday” or “real” mathematics (the metatheory) in order to describe, as we are doing here,
the formal language.
we are describing: Anything that we may do in, or say about, one logic trivially
translates to an equivalent activity in, or utterance about, the other as long as
we systematically carry out the translations of all occurrences of ¬ and ∃ to ∼
and E respectively (or vice versa).

An alternative point of view is that the symbol names are not the same as 
(identical with) the symbols they are naming. Thus, for example, “¬” names
the connective we pronounce not, but we do not know (or care) exactly what
the nature of this connective is (we only care about how it behaves). Thus, the
name “¬” becomes just a typographical expedient and may be replaced by other
names that name the same object, not.

This point of view gives one flexibility in, for example, deciding how the 
variable symbols are “implemented”. It often is convenient to think that the 
entire sequence of variable symbols was built from just two symbols, say, “v”
and “|”.† One way to do this is by saying that vᵢ is a name for the symbol
sequence‡

“v|…|” (with i |’s).

Or, preferably (see (2) below), vᵢ might be a name for the symbol sequence

“v|…|v” (with i |’s).

Regardless of option, vᵢ and vⱼ will name distinct objects if i ≠ j.

This is not the case for the metavariables (“abbreviated informal names”)
x, y, z, u, v, w. Unless we say so explicitly otherwise, x and y may name the
same formal variable, say, v_131.

We will mostly abuse language and deliberately confuse names with the
symbols they name. For example, we will say “let v_1007 be an object
variable ...” rather than “let v_1007 name an object variable ...”, thus appearing
to favour option one.

(2) Any two symbols included in the alphabet are distinct. Moreover, if any of
them are built from simpler “sub-symbols” — e.g., v_0, v_1, v_2, ... might really
name the strings vv, v|v, v||v, ... — then none of them is a substring (or
subexpression) of any other.§


† We intend these two symbols to be identical to their names. No philosophical or other purpose
will be served by allowing “more indirection” here (such as “v names u, which actually names
w, which actually is ...”).

‡ Not including the quotes.

§ What we have stated under (2) are requirements, not metatheorems! That is, they are nothing of
the sort that we can prove about our formal language within everyday mathematics.
(3) A formal language, just like a “natural” language (such as English or
Greek), is “alive” and evolving. The particular type of evolution we have in
mind is the one effected by formal definitions. Such definitions continually add
nonlogical symbols to the language.†

Thus, when we say that, e.g., “∈ and U are the only nonlogical symbols of
set theory”, we are telling a small white lie. More accurately, we ought to have
said that “∈ and U are the only ‘primitive’ nonlogical symbols of set theory”,
for we will add loads of other symbols such as ∪, ω, ∅, ⊂, ⊆.

This evolution affects the (formal) language of any theory, not just set 
theory. 


Wait a minute! If formal set theory is “the foundation of all mathematics”, and 
if, ostensibly, this chapter on logic assists us to found set theory itself, then 
how come we are employing natural numbers like 1200000000560000009 as 
subscripts in the names of object variables? How is it permissible to already talk 
about “sets of symbols” when we are about to found a theory of sets formally? 
Surely we do not “have”‡ any of these “items” yet, do we?

First off, the presence of subscripts such as 1200000000560000009 in

v_1200000000560000009

is a non-issue. One way to interpret what has been said in the definition is
to view the various v_i as abbreviated names of the real thing, the latter being
strings that employ the symbols v and | as in Remark I.1.3. In this connection
saying that v_i is “implemented” as

\[ v \underbrace{|\cdots|}_{i\ \text{|'s}} v \tag{1} \]

especially the use of “i” above, is only illustrative, thus totally superfluous. We
can say instead that strings of type (1) are the variables which we define as
follows without the help of the “natural number i” (this is a variation of how
this is done in Bourbaki (1966b) and Hermes (1973)):


An “|-calculation” forms a string like this: Write a “|”.§ This is the “current
string”. Repeat a finite number of times: Add (i.e., concatenate) one | imme- 
diately to the right of the current string. Write this new string (it is now the 
current string). 


† This phenomenon will be studied in some detail in what follows. By the way, any additions are
made to the nonlogical side of the alphabet. All the logical symbols have been given, once and
for all.

‡ “Do not have” in the sense of having not formally defined — or proved to exist — or both.

§ Without the quotes. These were placed to exclude the punctuation following.
Let us call any string that figures in some |-calculation a “|-string”. A variable
either is the string vv, or is obtained as the concatenation from left to right of 
v followed by an |-string, followed by v. 


All we now need is the ability to generate as many as necessary distinct 
variables (this is the “non-ending sequence” part of the definition, p. 7): For 
any two variables we get a new one that is different from either one by forming 
the string “v, followed by the concatenation of the two |-parts, followed by v”. 
Similarly if we had three, four, ... variables. By the way, two strings of | are
distinct iff† both occur in the same |-calculation, one, but not both, as the last
string.
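The generation procedure just described can be tried out mechanically. The following is only a minimal sketch: the function names `bar_strings`, `is_variable` and `fresh_variable` are hypothetical, strings of the formal alphabet are modelled as Python strings, and `fresh_variable` assumes that both of its arguments carry at least one |.

```python
from itertools import islice

def bar_strings():
    """Yield the strings written by an |-calculation, in order."""
    current = "|"                      # write a single |
    while True:
        yield current
        current = current + "|"       # add one | to the right of the current string

def is_variable(s: str) -> bool:
    """True iff s is vv, or v followed by an |-string followed by v."""
    return (len(s) >= 2 and s[0] == "v" and s[-1] == "v"
            and set(s[1:-1]) <= {"|"})

def fresh_variable(x: str, y: str) -> str:
    """v, then the two |-parts concatenated, then v.

    Assumption: x and y are variables whose |-parts are both nonempty,
    so the result is longer than, hence different from, either input.
    """
    return "v" + x[1:-1] + y[1:-1] + "v"

first_three = list(islice(bar_strings(), 3))
new_var = fresh_variable("v|v", "v||v")
```

Nothing here appeals to the natural numbers as objects of study: the code only concatenates and compares strings, which is the point the text is making.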

Another, more direct way to interpret what was said about object variables
on p. 7 is to take the definition literally, i.e., to suppose that it speaks about the
ontology of the variables.‡ Namely, the subscript is just a string of meaningless
symbols taken from the list below:

0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Again we can pretend that we know nothing about natural numbers, and whenever,
e.g., we want a variable other than either of v_123 or v_321, we may offer
either of v_123321 or v_321123 as such a new variable.

O.K., so we have not used natural numbers in the definition. But we did say 
“sets” and also “non-ending sequence”, implying the presence of infinite sets! 


As we have already noted, on one hand we have “real mathematics”, and on 
the other hand we have syntactic replicas of theories — the formal theories — 
that we built within real mathematics. Having built a formal theory, we can then 
choose to use it (acting like formalists) to generate theorems, the latter being 
codified as symbol sequences (formulas). Thus, the assertion “axiomatic set 
theory is the foundation of all mathematics” is just a colloquialism proffered 
in the metatheory that means that “within axiomatic set theory we can construct 
the known sets of mathematics, such as the reals ℝ and the complex numbers
ℂ, and moreover we can simulate what we informally do whenever we are
working in real or complex analysis, algebra, topology, theory of measure and 
integration, functional analysis, etc., etc.” 

There is no circularity here, but simply an empirical boastful observation in 
the metatheory of what our simulator can do. Moreover, our metatheory does 


† If and only if.

‡ Why not just say exactly what a definition is meant to say rather than leave it up to interpretation?
One certainly could, as in Bourbaki (1966b), make the ontology of variables crystal-clear right in
the definition. Instead, we have followed the custom of more recent writings and given the
definition in a quasi-sloppy manner that leaves the ontology of variables as a matter for speculation.
This gives one the excuse to write footnotes like this one and remarks like I.1.3.
have sets and all sorts of other mathematical objects. In principle we can use any
among those towards building or discussing the simulator, the formal theory. 


Thus, the question is not whether we can use sets, or natural numbers, in 
our definitions, but whether restrictions apply. For example, can we use infinite 
sets? 


If we are Platonists, then we have available in the metatheory all sorts of sets, 
including infinite sets, in particular the set of all natural numbers. We can use 
any of these items, speak about them, etc., as we please, when we are describing 
or building the formal theory within our metatheory. 

Now, if we are not Platonists, then our “real” mathematical world is much
more restricted. In one extreme, we have no infinite sets.†

We can still manage to define our formal language! After all, the “non-ending”
sequence of object variables v_0, v_1, v_2, ... can be finitely generated in
at least two different ways, as we have already seen. Thus we can explain (to 
a true formalist or finitist) that “non-ending sequence” was an unfortunate slip 
of the tongue, and that we really meant to give a procedure of how to generate 
on demand a new object variable, different from whatever ones we may already 
have. 

Two parting comments are in order: One, we have been somewhat selective
in the use of the term “metavariable”. We have called x, x′, y metavariables,
but have implied that the v_i are formal variables, even if they are just names
of formal objects such that we do not know or do not care what they look like.
Well, strictly speaking the abbreviations v_i are also metavariables, but they are
endowed with a property that the “generic” metavariables like x, y, z′ do not
have: Distinct v_i names denote distinct object variables (cf. I.1.3).

Two, we should clarify that a formal theory, when used (i.e., when the simulator
is being “run”), is a generator of strings, not a decider or “parser”. Thus, it
can generate any of the following: variables (if these are given by procedures),
formulas and terms (to be defined), or theorems (to be defined). Decision issues,
no matter how trivial, the system is not built to handle. These belong to the
metatheory. In particular, the theory does not see whatever numbers or strings
(like 12005) may be hidden in a variable name (such as v_12005).

Examples of decision questions: Is this string a term or a formula or a variable 
(finitely generated as above)? All these questions are “easy”. They are algo- 
rithmically decidable in the metatheory. Or, is this formula a theorem? This is 


† A finitist — and don’t forget that Hilbert-style metatheory was finitary, ostensibly for political
reasons — will let you have as many integers as you like in one serving, as long as the serving 
is finite. If you ask for more, you can have more, but never the set of all integers or an infinite 
subset thereof. 
algorithmically undecidable in the metatheory if it is a question about Peano
arithmetic or set theory.


I.1.4 Definition (Terminology about Strings). A symbol sequence or expression
(or string) that is formed by using symbols exclusively out of a given set†
M is called a string over the set, or alphabet, M.

If A and B denote strings (say, over M), then the symbol A ∗ B, or more
simply AB, denotes the symbol sequence obtained by listing first the symbols 
of A in the given left to right sequence, immediately followed by the symbols of 
B in the given left to right sequence. We say that AB is (more properly, denotes 
or names) the concatenation of the strings A and B in that order. 

We denote the fact that the strings (named) C and D are identical sequences
(but we just say that they are equal) by writing C ≡ D. The symbol ≢ denotes
the negation of the string equality symbol ≡. Thus, if # and ? are (we do mean
“are”) symbols from an alphabet, then

#?? ≡ #?? but #?? ≢ ##?

We can also employ ≡ in contexts such as “let A ≡ ##?”, where we give the
name A to the string ##?.‡

In this book the symbol ≡ will be exclusively used in the metatheory for equality
of strings over some set M.

The symbol λ normally denotes the empty string, and we postulate for it the
following behaviour:

A ≡ Aλ ≡ λA for all strings A

We say that A occurs in B, or is a substring of B, iff there are strings C and D
such that B ≡ CAD.

For example, “(” occurs four times in the (explicit) string “¬(()∨)((”, at
positions 2, 3, 7, 8. Each time this happens we have an occurrence of “(” in
“¬(()∨)((”.

If C ≡ λ, we say that A is a prefix of B. If moreover D ≢ λ, then we say
that A is a proper prefix of B.
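The notions of occurrence and prefix lend themselves to a small experiment. The sketch below is illustrative only: it models strings over an alphabet as Python strings (one character per symbol), and the helper names `occurrences`, `is_prefix` and `is_proper_prefix` are hypothetical. Positions are counted from 1. The sample input is an eight-symbol string such as ¬(()∨)((.

```python
def occurrences(a: str, b: str) -> list[int]:
    """1-based positions where a occurs in b, i.e. where b = c + a + d."""
    return [i + 1
            for i in range(len(b) - len(a) + 1)
            if b[i:i + len(a)] == a]

def is_prefix(a: str, b: str) -> bool:
    """a is a prefix of b: b = c + a + d with c the empty string."""
    return b.startswith(a)

def is_proper_prefix(a: str, b: str) -> bool:
    """a is a prefix of b and the remainder d is not empty."""
    return b.startswith(a) and len(a) < len(b)

positions = occurrences("(", "¬(()∨)((")
```

Each entry of `positions` corresponds to one choice of the strings C and D in the definition of “occurs in”.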


† A set that supplies symbols to be used in building strings is not special. It is just a set. However,
it often has a special name: “alphabet”.

‡ Punctuation such as “.” is not part of the string. One often avoids such footnotes by enclosing
strings that are explicitly written as symbol sequences inside quotes. For example, if A stands
for the string #, one writes A ≡ “#”. Note that we must not write “A”, unless we mean a string
whose only symbol is A.




I.1.5 Definition (Terms). The set of terms, Term, is the smallest set of strings
over the alphabet 𝒱 with the following two properties:

(1) All of the items in LS.1 or NLS.1 (x, y, z, a, b, c, etc.) are included.
(2) If f is a function† of arity n and t_1, t_2, ..., t_n are included, then so is the
string “f t_1 t_2 ... t_n”.


The symbols t, s, and u, with or without subscripts or primes, will denote
arbitrary terms. Since we are using them in the metalanguage to “vary over” 
terms, we naturally call them metavariables. They also serve — as variables — 
towards the definition (this one) of the syntax of terms. For this reason they are 
also called syntactic variables. 


I.1.6 Remark. (1) We often abuse notation and write f(t_1, ..., t_n) instead of
f t_1 ... t_n.

(2) Definition I.1.5 is an inductive definition.‡ It defines a more or less
“complicated” term by assuming that we already know what “simpler” terms
look like. This is a standard technique employed in real mathematics. We will
have the opportunity to say more about such inductive definitions — and their
appropriateness — in a comment later on.

(3) We relate this particular manner of defining terms to our working
definition of a theory (given on p. 6 immediately before Remark I.1.1 in terms
of “rules” of formation). Item (2) in I.1.5 essentially says that we build new
terms (from old ones) by applying the following general rule: Pick an arbitrary
function symbol, say f. This has a specific formation rule associated with it
that, for the appropriate number, n, of an already existing ordered list of terms,
t_1, ..., t_n, will build the new term consisting of f, immediately followed by
the ordered list of the given terms.

To be specific, suppose we are working in the language of number theory.
There is a function symbol + available there. The rule associated with + builds
the new term +ts for any previously obtained terms t and s. For example, +v_1v_13
and +v_121+v_1v_3 are well-formed terms. We normally write terms of number
theory in “infix” notation,§ i.e., t + s, v_1 + v_13 and v_121 + (v_1 + v_3) (note the
intrusion of brackets, to indicate sequencing in the application of +).


† We will omit from now on the qualification “symbol” from terminology such as “function
symbol”, “constant symbol”, “predicate symbol”.

‡ Some mathematicians will absolutely insist that we call this a recursive definition and reserve
the term “induction” for “induction proofs”. This is seen to be unwarranted hair splitting if we
consider that Bourbaki (1966b) calls induction proofs “démonstrations par récurrence”. We will
be less dogmatic: Either name is all right.

§ Function symbol placed between the arguments.
A by-product of what we have just described is that the arity of a function
symbol f is whatever number of terms the associated rule will require as input.
(4) A crucial word used in I.1.5 (which recurs in all inductive definitions) is
“smallest”. It means “least inclusive” (set). For example, we may easily think of
a set of strings that satisfies both conditions of the above definition, but which is
not “smallest” by virtue of having additional elements, such as the string “¬¬(”.


Pause. Why is “¬¬(” not in the smallest set as defined above, and therefore
not a term?


The reader may wish to ponder further on the import of the qualification 
“smallest” by considering the familiar (similar) example of N, the set of natural 
numbers. The principle of induction in N ensures that this set is the smallest 
with the properties: 


(i) 0 is included, and
(ii) if n is included, then so is n + 1.


By contrast, all of Z (set of integers), Q (set of rational numbers), R (set of real
numbers) satisfy (i) and (ii), but they are clearly not the “smallest” such.
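Although the formal theory itself only generates terms, in the metatheory one can decide whether a given string is a term of Definition I.1.5. The sketch below is one possible way to do so, under loudly stated assumptions: a string is modelled as a list of tokens, the table `ARITY` of function symbols and the set `CONSTANTS` are hypothetical choices in the spirit of number theory, and variables are tokens of the form v followed by digits.

```python
ARITY = {"+": 2, "×": 2, "S": 1}   # hypothetical function symbols and their arities
CONSTANTS = {"0"}                   # hypothetical constant symbols

def is_atom(tok: str) -> bool:
    """Variables (v followed by digits) and constants are the initial terms."""
    return tok in CONSTANTS or (tok.startswith("v") and tok[1:].isdigit())

def parse_term(tokens, pos=0):
    """Parse one prefix-notation term at tokens[pos].

    Returns (tree, next_position), or None if no term starts at pos.
    """
    if pos >= len(tokens):
        return None
    tok = tokens[pos]
    if is_atom(tok):
        return tok, pos + 1
    if tok in ARITY:
        args, pos = [], pos + 1
        for _ in range(ARITY[tok]):        # the rule for tok asks for ARITY[tok] terms
            result = parse_term(tokens, pos)
            if result is None:
                return None
            arg, pos = result
            args.append(arg)
        return (tok, *args), pos
    return None

def is_term(tokens) -> bool:
    """True iff the whole token list is exactly one term."""
    result = parse_term(tokens)
    return result is not None and result[1] == len(tokens)

def infix(tree) -> str:
    """Render a parsed term with binary symbols in bracketed infix form."""
    if isinstance(tree, str):
        return tree
    op, *args = tree
    if len(args) == 2:
        return "(" + infix(args[0]) + " " + op + " " + infix(args[1]) + ")"
    return op + "(" + ", ".join(infix(a) for a in args) + ")"

tree, _ = parse_term(["+", "v121", "+", "v1", "v3"])
```

A token list such as `["¬", "¬", "("]` is rejected by `is_term`, since no formation rule of this sketch ever produces it. That is the sense in which such a string lies outside the smallest set.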


I.1.7 Definition (Atomic Formulas). The set of atomic formulas, Af, contains
precisely:


(1) The strings t = s for every possible choice of terms t, s.
(2) The strings P t_1 t_2 ... t_n for every possible choice of n-ary predicates P (for
all choices of n > 0) and all possible choices of terms t_1, t_2, ..., t_n.


We often abuse notation and write P(t_1, ..., t_n) instead of P t_1 ... t_n.


I.1.8 Definition (Well-Formed Formulas). The set of well-formed formulas,
Wff, is the smallest set of strings or expressions over the alphabet 𝒱 with the
following properties:


(a) All the members of Af are included.

(b) If 𝒜 and ℬ denote strings (over 𝒱) that are included, then (𝒜 ∨ ℬ) and
(¬𝒜) are also included.

(c) If 𝒜 is† a string that is included and x is any object variable (which may or
may not occur (as a substring) in the string 𝒜), then the string ((∃x)𝒜) is
also included. We say that 𝒜 is the scope of (∃x).


† Denotes!
I.1.9 Remark.


(1) The above is yet another inductive definition. Its statement (in the
metalanguage) is facilitated by the use of so-called syntactic, or meta, variables —
𝒜 and ℬ — used as names for arbitrary (indeterminate) formulas. In general,
we will let calligraphic capital letters 𝒜, ℬ, 𝒞, 𝒟, ℰ, ℱ, 𝒢 (with or
without primes or subscripts) be names for well-formed formulas, or just
formulas, as we often say. The definition of Wff given above is standard.
In particular, it permits well-formed formulas such as ((∃x)((∃x)x = 0)) in
the interest of making the formation rules “context-free”.†

(2) The rules of syntax just given do not allow us to write things such as ∃f or
∃P where f and P are function and predicate symbols respectively. That
quantification is deliberately restricted to act solely on object variables
makes the language first order.

(3) We have already indicated in Remark I.1.6 where the arities (of function and 
predicate symbols) come from (Definitions I.1.5 and I.1.7 referred to them). 
These are numbers that are implicit (“hardwired”’) with the formation rules 
for terms and atomic formulas. Each function and each predicate symbol 
(e.g., +, ×, ∈, <) has its own unique formation rule. This rule “knows” how
many terms are needed (on the input side) in order to form a term or atomic 
formula. Therefore, since the theory, in use, applies rather than studies its 
formation rules, it is, in particular, ignorant of arities of symbols. 

Now that this jurisdictional point has been made (cf. the concluding 
remarks about decision questions, on p. 12), we can consider an alternative 
way of making arities of symbols known (in the metatheory): Rather than 
embedding arities in the formation rules, we can hide them in the ontology 
of the symbols, not making them explicit in the name. 

For example, a new symbol, say ∗, can be used to record arity. That
is, we can think of a predicate (or function) symbol as consisting of two
parts: an arity part and an “all the rest” part, the latter needed to render the
symbol unique.‡ For example, ∈ may be actually the name for the symbol
“∈∗∗” where this latter name is identical to the symbol it denotes, or “what
you see is what you get” — see Remark I.1.3(1) and (2), p. 8. The presence
of the two asterisks declares the arity. Some people say this differently:
They make available to the metatheory a “function”, ar, from “the set of 


† In some presentations, the formation rule in I.1.8(c) is “context-sensitive”: It requires that x be
not already quantified in 𝒜.

‡ The reader may want to glimpse ahead, on p. 166, to see a possible implementation in the case
of number theory.
all predicate symbols and functions” (of a given language) to the natural
numbers, so that for any function symbol f or predicate symbol P, ar(f)
and ar(P) yield the arities of f and P respectively.†
(4) Abbreviations
Abr1. The string ((∀x)𝒜) abbreviates the string “(¬((∃x)(¬𝒜)))”. Thus,
for any explicitly written formula 𝒜, the former notation is informal
(metamathematical), while the latter is formal (within the formal
language). In particular, ∀ is a metalinguistic symbol. “∀x” is the
universal quantifier. 𝒜 is its scope. The symbol ∀ is pronounced for all.

We also introduce — in the metalanguage — a number of additional Boolean
connectives in order to abbreviate certain strings:

Abr2. (Conjunction, ∧) (𝒜 ∧ ℬ) stands for (¬((¬𝒜) ∨ (¬ℬ))). The
symbol ∧ is pronounced and.

Abr3. (Classical or material implication, →) (𝒜 → ℬ) stands for
((¬𝒜) ∨ ℬ). (𝒜 → ℬ) is pronounced if 𝒜, then ℬ.

Abr4. (Equivalence, ↔) (𝒜 ↔ ℬ) stands for ((𝒜 → ℬ) ∧ (ℬ → 𝒜)).

Abr5. To minimize the use of brackets in the metanotation we adopt standard
priorities of connectives: ∀, ∃, and ¬ have the highest, and then
we have (in decreasing order of priority) ∧, ∨, →, ↔, and we agree
not to use outermost brackets. All associativities are right — that is,
if we write 𝒜 → ℬ → 𝒞, then this is a (sloppy) counterpart for
(𝒜 → (ℬ → 𝒞)).

(5) The language just defined, L, is one-sorted, that is, it has a single sort or 
type of object variable. Is this not inconvenient? After all, our set theory 
(volume 2 of these lectures) will have both atoms and sets. In other theories, 
e.g., geometry, one has points, lines, and planes. One would have hoped to 
have different “types” of variables, one for each. 

Actually, to do this would amount to a totally unnecessary complication 
of syntax. We can (and will) get away with just one sort of object variable. 
For example, in set theory we will also introduce a 1-ary‡ predicate, U,
whose job is to “test” an object for “sethood”.§ Similar remedies are available
to other theories. For example, geometry will manage with one sort of
variable and unary predicates “Point”, “Line”, and “Plane”. 


† In mathematics we understand a function as a set of input-output pairs. One can “glue” the two
parts of such pairs together, as in “∈∗∗” — where “∈” is the input part and “∗∗” is the output part,
the latter denoting “2” — etc. Thus, the two approaches are equivalent.

‡ More commonly called unary.

§ People writing about, or teaching, set theory have made this word up. Of course, one means by
it the property of being a set.
Apropos language, some authors emphasize the importance of the
nonlogical symbols, taking at the same time the formation rules for
granted; thus they say that we have a language, say, “L = {∈, U}” rather
than “L = (𝒱, Term, Wff) where 𝒱 has ∈ and U as its only nonlogical
symbols”. That is, they use “language” for the nonlogical part of the
alphabet.
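The abbreviations Abr1 to Abr4 of item (4) are purely mechanical rewritings, so they can be unfolded by a short program. What follows is a sketch under assumptions, not part of the formal development: formulas are modelled as nested Python tuples with hypothetical tags (`"atom"`, `"not"`, `"or"`, `"exists"` for the formal connectives, and `"forall"`, `"and"`, `"imp"`, `"iff"` for the metalinguistic ones), and `show` prints the fully bracketed formal string.

```python
def expand(f):
    """Rewrite a formula so that only ¬, ∨ and ∃ remain."""
    tag = f[0]
    if tag == "atom":
        return f
    if tag == "not":
        return ("not", expand(f[1]))
    if tag == "or":
        return ("or", expand(f[1]), expand(f[2]))
    if tag == "exists":
        return ("exists", f[1], expand(f[2]))
    if tag == "forall":   # Abr1: (∀x)A is ¬(∃x)¬A
        return ("not", ("exists", f[1], ("not", expand(f[2]))))
    if tag == "and":      # Abr2: A ∧ B is ¬(¬A ∨ ¬B)
        return ("not", ("or", ("not", expand(f[1])), ("not", expand(f[2]))))
    if tag == "imp":      # Abr3: A → B is ¬A ∨ B
        return ("or", ("not", expand(f[1])), expand(f[2]))
    if tag == "iff":      # Abr4: A ↔ B is (A → B) ∧ (B → A)
        return expand(("and", ("imp", f[1], f[2]), ("imp", f[2], f[1])))
    raise ValueError("unknown tag: " + str(tag))

def show(f) -> str:
    """Print a formula that uses only the formal connectives, fully bracketed."""
    tag = f[0]
    if tag == "atom":
        return f[1]
    if tag == "not":
        return "(¬" + show(f[1]) + ")"
    if tag == "or":
        return "(" + show(f[1]) + " ∨ " + show(f[2]) + ")"
    if tag == "exists":
        return "((∃" + f[1] + ")" + show(f[2]) + ")"
    raise ValueError("not a formal connective: " + str(tag))

P, Q = ("atom", "P"), ("atom", "Q")
implication = show(expand(("imp", P, Q)))
universal = show(expand(("forall", "x", P)))
conjunction = show(expand(("and", P, Q)))
```

The separation between `expand` and `show` mirrors the distinction in the text: the abbreviated notation lives in the metalanguage, while only the output of `expand` is a string of the formal language.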


A variable that is quantified is bound in the scope of the quantifier. Non- 
quantified variables are free. We also give below, by induction on formulas, 
precise (metamathematical) definitions of “free” and “bound”. 


I.1.10 Definition (Free and Bound Variables). An object variable x occurs
free in a term t or atomic formula 𝒜 iff it occurs in t or 𝒜 as a substring
(see I.1.4).


x occurs free in (¬𝒜) iff it occurs free in 𝒜.

x occurs free in (𝒜 ∨ ℬ) iff it occurs free in at least one of 𝒜 or ℬ.

x occurs free in ((∃y)𝒜) iff x occurs free in 𝒜, and y is not the same
variable as x.†

The y in ((∃y)𝒜) is, of course, not free — even if it might be so in 𝒜 — as
we have just concluded in this inductive definition. We say that it is bound in
((∃y)𝒜). Trivially, terms and atomic formulas have no bound variables.


I.1.11 Remark. (1) Of course, Definition I.1.10 takes care of the defined
connectives as well, via the obvious translation procedure.

(2) Notation. If 𝒜 is a formula, then we often write 𝒜[y_1, ..., y_k] to indicate
our interest in the variables y_1, ..., y_k, which may or may not be free in 𝒜.
Indeed, there may be other free variables in 𝒜 that we may have chosen not to
include in the list.

On the other hand, if we use round brackets, as in 𝒜(y_1, ..., y_k), then we
are implicitly asserting that y_1, ..., y_k is the complete list of free variables that
occur in 𝒜.


I.1.12 Definition. A term or formula is closed iff no free variables occur in it.
A closed formula is called a sentence. 

A formula is open iff it contains no quantifiers (thus, an open formula may 
also be closed). 
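The inductive clauses for “occurs free”, together with the notions of closed and open formula, translate directly into a recursive computation. The sketch below makes several assumptions for the sake of illustration: formulas are nested tuples with hypothetical tags (`"eq"`, `"pred"`, `"not"`, `"or"`, `"exists"`), terms are either strings or tuples headed by a function symbol, and a string counts as a variable exactly when it starts with v.

```python
def term_vars(t) -> set:
    """Variables occurring in a term (strings starting with v are variables)."""
    if isinstance(t, str):
        return {t} if t.startswith("v") else set()
    return set().union(*(term_vars(arg) for arg in t[1:]))

def free_vars(f) -> set:
    """Set of variables that occur free in the formula f."""
    tag = f[0]
    if tag == "eq":                       # atomic: t = s
        return term_vars(f[1]) | term_vars(f[2])
    if tag == "pred":                     # atomic: P t_1 ... t_n, stored as ("pred", P, t_1, ...)
        return set().union(*(term_vars(t) for t in f[2:]))
    if tag == "not":
        return free_vars(f[1])
    if tag == "or":
        return free_vars(f[1]) | free_vars(f[2])
    if tag == "exists":                   # ((∃y)A): free in A and different from y
        return free_vars(f[2]) - {f[1]}
    raise ValueError("unknown tag: " + str(tag))

def is_closed(f) -> bool:
    """Closed: no variable occurs free."""
    return not free_vars(f)

def is_open(f) -> bool:
    """Open: the formula contains no quantifiers."""
    tag = f[0]
    if tag in ("eq", "pred"):
        return True
    if tag == "not":
        return is_open(f[1])
    if tag == "or":
        return is_open(f[1]) and is_open(f[2])
    if tag == "exists":
        return False
    raise ValueError("unknown tag: " + str(tag))

quantified = ("exists", "v1", ("eq", "v1", ("+", "v2", "0")))
ground = ("eq", "0", "0")
```

Here `ground` is both open and closed, which illustrates the parenthetical observation in the definition above.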


† Recall that x and y are abbreviations of names such as v_1200098 and v_11009 (which name distinct
variables). However, it could be that both x and y name v_101. Therefore it is not redundant to say
“and y is not the same variable as x”. By the way, x ≢ y says the same thing, by I.1.4.
I.2. A Digression into the Metatheory:
Informal Induction and Recursion 


We have already seen a number of inductive or recursive definitions in Sec- 
tion I.1. The reader, most probably, has already seen or used such definitions 
elsewhere. 

We will organize the common important features of inductive definitions 
in this section, for easy reference. We just want to ensure that our grasp of 
these notions and techniques, at the metamathematical level, is sufficient for 
the needs of this volume. 

One builds a set S by recursion, or inductively (or by induction), out of two
ingredients: a set of initial objects, ℐ, and a set of rules or operations, ℛ. A
member of ℛ — a rule — is a (possibly infinite) table, or relation, like

y_1  ...  y_n  z
a_1  ...  a_n  a_{n+1}
b_1  ...  b_n  b_{n+1}

If the above rule (table) is called Q, then we use the notations†

Q(a_1, ..., a_n, a_{n+1}) and (a_1, ..., a_n, a_{n+1}) ∈ Q

interchangeably to indicate that the ordered sequence or “row” a_1, ..., a_n, a_{n+1}
is present in the table.

We say that “Q(a_1, ..., a_n, a_{n+1}) holds” or “Q(a_1, ..., a_n, a_{n+1}) is true”,
but we often also say that “Q applied to a_1, ..., a_n yields a_{n+1}”, or that “a_{n+1}
is a result or output of Q, when the latter receives input a_1, ..., a_n”. We often
abbreviate such inputs using vector notation, namely, a⃗_n (or just a⃗, if n is
understood). Thus, we may write Q(a⃗_{n+1}) for Q(a_1, ..., a_n, a_{n+1}).


A rule Q that has n + 1 columns is called (n + 1)-ary.


I.2.1 Definition. We say “a set T is closed under an (n + 1)-ary rule Q” to
mean that whenever c_1, ..., c_n are all in T, then d ∈ T for all d satisfying

Q(c_1, ..., c_n, d).


With these preliminary understandings out of the way, we now state 


† “x ∈ A” means that “x is a member of — or is in — A” in the informal set-theoretic sense.
I.2.2 Definition. S is defined by recursion, or by induction, from initial objects
ℐ and set of rules ℛ, provided it is the smallest (least inclusive) set with the
properties


(1) ℐ ⊆ S,†
(2) S is closed under every Q in ℛ. In this case we say that S is ℛ-closed.


We write S = Cl(ℐ, ℛ), and say that “S is the closure of ℐ under ℛ”.


We have at once: 


I.2.3 Metatheorem (Induction on S). If S = Cl(ℐ, ℛ) and if some set T
satisfies


(1) ℐ ⊆ T, and
(2) T is closed under every Q in ℛ,


then S ⊆ T.


Pause. Why is the above a metatheorem? 


The above principle of induction on S is often rephrased as follows: To prove
that a property P(x) holds for all members of Cl(ℐ, ℛ), just prove that


(a) every member of ℐ has the property, and

(b) the property propagates with every rule in ℛ, i.e., if P(c_i) holds (is true)
for i = 1, ..., n, and if Q(c_1, ..., c_n, d) holds, then d too has the property
P(x) — that is, P(d) holds.


Of course, this rephrased principle is valid, for if we let T be the set of all
objects that have property P(x) — for which set one employs the well-established
symbol {x : P(x)} — then this T satisfies (1) and (2) of the metatheorem.‡


I.2.4 Definition (Derivations and Parses). An (ℐ, ℛ)-derivation, or simply
derivation — if ℐ and ℛ are understood — is a finite sequence of objects d_1, ..., d_n


† From our knowledge of elementary informal set theory, we recall that A ⊆ B means that every
member of A is also a member of B.

‡ We are sailing too close to the wind here! It turns out that not all properties P(x) lead to sets
{x : P(x)}. Our explanation was naive. However, formal set theory, which is meant to save us from
our naiveté, upholds the “principle” (a)-(b) using just a slightly more complicated explanation.
The reader can see this explanation in our volume 2 in the chapter on cardinality.
(n ≥ 1) such that each d_i is


(1) a member of ℐ, or†
(2) for some (r + 1)-ary Q ∈ ℛ, Q(d_{j_1}, ..., d_{j_r}, d_i) holds, and j_l < i for
l = 1, ..., r.


We say that d_i is derivable within i steps.
A derivation of an object a is also called a parse of a.


If d_1, ..., d_n is a derivation, then so is d_1, ..., d_m, for any 1 ≤ m < n.

If d is derivable within n steps, it is also derivable in k steps or less, for all
k > n, since we can lengthen a derivation arbitrarily by adding ℐ-elements
to it.
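Checking whether a given finite sequence is a derivation is a decision question of the easy kind, and it can be sketched in a few lines. The encoding below is an assumption made for illustration: a rule is a pair `(r, holds)`, where `r` is the number of inputs and `holds(inputs, output)` reports whether the corresponding row is in the table. The function name `is_derivation` is hypothetical.

```python
from itertools import product

def is_derivation(seq, initial, rules) -> bool:
    """Check the two clauses of the definition for every entry of seq.

    seq     : nonempty list of objects d_1, ..., d_n
    initial : set of initial objects
    rules   : list of pairs (r, holds) with holds(inputs, output) -> bool
    """
    if len(seq) < 1:
        return False
    for i, d in enumerate(seq):
        if d in initial:                       # clause (1)
            continue
        earlier = seq[:i]                      # only strictly earlier entries may be inputs
        justified = any(holds(inputs, d)       # clause (2)
                        for r, holds in rules
                        for inputs in product(earlier, repeat=r))
        if not justified:
            return False
    return True

successor = [(1, lambda xs, y: y == xs[0] + 1)]
plus_minus = [(2, lambda xs, z: z == xs[0] + xs[1]),
              (2, lambda xs, z: z == xs[0] - xs[1])]
```

With `successor` and initial set `{0}`, the sequence 0, 1, 2, 3 passes the check, and so does the lengthened sequence 0, 0, 1, in line with the remark that initial objects may be added freely.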


I.2.5 Remark. The following metatheorem shows that there is a way to
“construct” Cl(ℐ, ℛ) iteratively, i.e., one element at a time by repeated application
of the rules.

This result shows definitively that our inductive definitions of terms (I.1.5)
and well-formed formulas (I.1.8) fully conform with our working definition of
theory, as an alphabet and a set of rules that are used to build formulas and
theorems (p. 5).


I.2.6 Metatheorem.


Cl(ℐ, ℛ) = {x : x is (ℐ, ℛ)-derivable within some number of steps, n}


Proof. For notational convenience let us write
T = {x : x is (ℐ, ℛ)-derivable within some number of steps, n}.


As we know from elementary naive set theory, we need to show here both
Cl(ℐ, ℛ) ⊆ T and Cl(ℐ, ℛ) ⊇ T to settle the claim.

(⊆) We do induction on Cl(ℐ, ℛ) (using I.2.3). Now ℐ ⊆ T, since every
member of ℐ is derivable in n = 1 step. (Why?)

Also, T is closed under every Q in ℛ. Indeed, let such an (r + 1)-ary Q be
chosen, and assume


Q(a_1, ..., a_r, b)    (i)


† This “or” is inclusive: (1), or (2), or both.
and {a_1, ..., a_r} ⊆ T. Thus, each a_i has an (ℐ, ℛ)-derivation. Concatenate all
these derivations:

..., a_1, ..., a_2, ..., ..., a_r

The above is a derivation (why?). But then, so is

..., a_1, ..., a_2, ..., ..., a_r, b

by (i). Thus, b ∈ T.

(⊇) We argue this — that is, “if d ∈ T, then d ∈ Cl(ℐ, ℛ)” — by induction
on the number of steps, n, in which d is derivable.

For n = 1 we have d ∈ ℐ and we are done, since ℐ ⊆ Cl(ℐ, ℛ).

Let us make the induction hypothesis (I.H.) that for derivations of ≤ n steps
the claim is true. Let then d be derivable within n + 1 steps. Thus, there is a
derivation d_1, ..., d_n, d.

Now, if d ∈ ℐ, we are done as above (is this a “real case”?). If on the other
hand Q(d_{j_1}, ..., d_{j_r}, d), then each d_{j_i} is derivable within at most n steps,
so for i = 1, ..., r we have d_{j_i} ∈ Cl(ℐ, ℛ) by the
I.H.; hence d ∈ Cl(ℐ, ℛ), since the closure is closed under all Q ∈ ℛ.


I.2.7 Example. One can see now that N = Cl(ℐ, ℛ), where ℐ = {0} and ℛ
contains just the relation y = x + 1 (input x, output y). Similarly, Z, the set
of all integers, is Cl(ℐ, ℛ), where ℐ = {0} and ℛ contains just the relations
y = x + 1 and y = x − 1 (input x, output y).
For the latter, the inclusion Cl(ℐ, ℛ) ⊆ Z is trivial (by I.2.3). For ⊇ we
easily see that any n ∈ Z has an (ℐ, ℛ)-derivation (and then we are done by I.2.6).
For example, if n > 0, then 0, 1, 2, ..., n is a derivation, while if n < 0, then
0, −1, −2, ..., n is one. If n = 0, then the one-term sequence 0 is a derivation.
Another interesting closure is obtained by ℐ = {3} and the two relations
z = x + y and z = x − y. This is the set {3k : k ∈ Z} (see Exercise I.1).


Pause. So, taking the first sentence of I.2.7 one step further, we note that we
have just proved the induction principle for N, for that is exactly what the
“equation” N = Cl(ℐ, ℛ) says (by I.2.3). Do you agree?

There is another way to view the iterative construction of Cl(ℐ, ℛ): The
set is constructed in stages. Below we are using some more notation borrowed
from informal set theory. For any sets A and B we write A ∪ B to indicate the
set union, which consists of all the members found in A or B or in both. More
generally, if we have a lot of sets, X_0, X_1, X_2, ..., that is, one X_i for every
integer i ≥ 0 — which we denote by the compact notation (X_i)_{i≥0} — then we
may wish to form a set that includes all the objects found as members all over
the X_i, that is (using inclusive, or logical, “or”s below), form


{x : x ∈ X_0 or x ∈ X_1 or ...}
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or, more elegantly and precisely, 
{x : for some i > 0, x € X;} 


The latter is called the union of the sequence (X;);+9 and is often denoted by 
i>0 xX i or U xX i 
i>0 
Correspondingly, we write 
ie Xx ij or U xX i 
i<n 


if we only want to take a finite union, also indicated clumsily as XyU ... UXp. 


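I.2.8 Definition (Stages). In connection with $\mathrm{Cl}(\mathcal{I}, \mathcal{R})$ we define the sequence
of sets $(X_i)_{i \geq 0}$ by induction on $n$, as follows:

$$X_0 = \mathcal{I}$$

$$X_{n+1} = \Big(\bigcup_{i \leq n} X_i\Big) \cup \Big\{ b : \text{for some } Q \in \mathcal{R} \text{ and some } \vec{a} \text{ in } \bigcup_{i \leq n} X_i,\ Q(\vec{a}, b) \Big\}$$

That is, to form $X_{n+1}$ we append to $\bigcup_{i \leq n} X_i$ all the outputs of all the relations
in $\mathcal{R}$ acting on all possible inputs, the latter taken from $\bigcup_{i \leq n} X_i$.
We say that $X_i$ is built at stage $i$, from initial objects $\mathcal{I}$ and rule-set $\mathcal{R}$.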


In words, at stage 0 we are given the initial objects ($X_0 = \mathcal{I}$). At stage 1 we
apply all possible relations to all possible objects that we have so far (they
form the set $X_0$) and build the 1st stage set, $X_1$, by appending the outputs to
what we have so far. At stage 2 we apply all possible relations to all possible
objects that we have so far (they form the set $X_0 \cup X_1$) and build the 2nd
stage set, $X_2$, by appending the outputs to what we have so far. And so on.

When we work in the metatheory, we take for granted that we can have
simple inductive definitions on the natural numbers. The reader is familiar with
several such definitions, e.g.,

$$a^0 = 1 \quad (\text{for } a \neq 0 \text{ throughout})$$

$$a^{n+1} = a \cdot a^n$$

We will (meta)prove a general theorem on the feasibility of recursive definitions
later on (I.2.13).


The following theorem connects stages and closures.


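I.2.9 Metatheorem. With the $X_i$ as in I.2.8,

$$\mathrm{Cl}(\mathcal{I}, \mathcal{R}) = \bigcup_{i \geq 0} X_i$$

Proof. ($\subseteq$) We do induction on $\mathrm{Cl}(\mathcal{I}, \mathcal{R})$. For the basis, $\mathcal{I} = X_0 \subseteq \bigcup_{i \geq 0} X_i$.

We show that $\bigcup_{i \geq 0} X_i$ is $\mathcal{R}$-closed. Let $Q \in \mathcal{R}$ and $Q(\vec{a}_n, b)$ hold, for some
$\vec{a}_n$ in $\bigcup_{i \geq 0} X_i$. Thus, by definition of union, there are integers $j_1, j_2, \ldots, j_n$
such that $a_i \in X_{j_i}$, $i = 1, \ldots, n$. If $k = \max\{j_1, \ldots, j_n\}$, then $\vec{a}_n$ is in $\bigcup_{i \leq k} X_i$;
hence $b \in X_{k+1} \subseteq \bigcup_{i \geq 0} X_i$.

($\supseteq$) It suffices to prove that $X_n \subseteq \mathrm{Cl}(\mathcal{I}, \mathcal{R})$, a fact we can prove by induction
on $n$. For $n = 0$ it holds by I.2.2. As an I.H. we assume the claim for all $n \leq k$.

The case for $k + 1$: $X_{k+1}$ is the union of two sets. One is $\bigcup_{i \leq k} X_i$. This is a
subset of $\mathrm{Cl}(\mathcal{I}, \mathcal{R})$ by the I.H. The other is

$$\Big\{ b : \text{for some } Q \in \mathcal{R} \text{ and some } \vec{a} \text{ in } \bigcup_{i \leq k} X_i,\ Q(\vec{a}, b) \Big\}$$

This too is a subset of $\mathrm{Cl}(\mathcal{I}, \mathcal{R})$, by the preceding observation and the fact that
$\mathrm{Cl}(\mathcal{I}, \mathcal{R})$ is $\mathcal{R}$-closed.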


Worth Saying. An inductively defined set can be built by stages.
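
The stage construction of I.2.8 can be tried out on a small case. The following is a minimal Python sketch, not part of the original text. It makes two assumptions of its own: each rule is given as a function (inputs to output) rather than as a general relation, and the construction is truncated to a finite `universe` so that the loop terminates. The names `closure_by_stages`, `universe`, and `max_stages` are hypothetical.

```python
from itertools import product


def closure_by_stages(initial, rules, universe, max_stages=1000):
    """Build the stages X_0, X_1, ... until no new object appears.

    initial  : the initial objects (the set I)
    rules    : list of (arity, function) pairs; each function stands in
               for a relation Q by mapping its inputs to one output
    universe : finite set that truncates the construction (assumption of
               this sketch; the text's closure is not truncated)
    """
    so_far = set(initial) & set(universe)        # X_0
    stages = [set(so_far)]
    for _ in range(max_stages):
        outputs = set()
        for arity, rule in rules:
            # all possible inputs taken from what we have so far
            for inputs in product(so_far, repeat=arity):
                b = rule(*inputs)
                if b in universe:
                    outputs.add(b)
        next_stage = so_far | outputs            # X_{n+1}
        if next_stage == so_far:                 # nothing new was appended
            break
        stages.append(next_stage)
        so_far = next_stage
    return so_far, stages


# I = {3}, rules z = x + y and z = x - y, truncated to -12..12.
universe = set(range(-12, 13))
rules = [(2, lambda x, y: x + y), (2, lambda x, y: x - y)]
closure, stages = closure_by_stages({3}, rules, universe)
print(sorted(closure))   # [-12, -9, -6, -3, 0, 3, 6, 9, 12]
```

Within the truncated range the result is the set of multiples of 3, in agreement with the last closure mentioned in Example I.2.7.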


I.2.10 Definition (Immediate Predecessors, Ambiguity). If $d \in \mathrm{Cl}(\mathcal{I}, \mathcal{R})$
and for some $Q$ and $a_1, \ldots, a_r$ it is the case that $Q(a_1, \ldots, a_r, d)$, then the
$a_1, \ldots, a_r$ are immediate $Q$-predecessors of $d$, or just immediate predecessors
if $Q$ is understood; for short, i.p.

A pair $(\mathcal{I}, \mathcal{R})$ is called ambiguous if some $d \in \mathrm{Cl}(\mathcal{I}, \mathcal{R})$ satisfies any (or
all) of the following conditions:

(i) It has two (or more) distinct sets of immediate $P$-predecessors for some
rule $P$.

(ii) It has both immediate $P$-predecessors and immediate $Q$-predecessors, for
$P \neq Q$.

(iii) It is a member of $\mathcal{I}$, yet it has immediate predecessors.

If $(\mathcal{I}, \mathcal{R})$ is not ambiguous, then it is unambiguous.

I.2.11 Example. The pair $(\{00, 0\}, \{Q\})$, where $Q(x, y, z)$ holds iff $z = xy$
(where “$xy$” denotes the concatenation of the strings $x$ and $y$, in that order), is
ambiguous. For example, 0000 has the two immediate predecessor sets $\{00, 00\}$
and $\{0, 000\}$. Moreover, while 00 is an initial object, it does have immediate
predecessors, namely, the set $\{0, 0\}$ (or, what amounts to the same thing, $\{0\}$).
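
The ambiguity in Example I.2.11 can be checked mechanically. The sketch below is an illustration only, under the assumption that the closure is truncated at strings of length 4; the helper names `concat_closure` and `immediate_predecessors` are hypothetical.

```python
def concat_closure(initial, max_len):
    """Closure of `initial` under concatenation, truncated at max_len."""
    closure = set(initial)
    changed = True
    while changed:
        changed = False
        for x in list(closure):
            for y in list(closure):
                z = x + y
                if len(z) <= max_len and z not in closure:
                    closure.add(z)
                    changed = True
    return closure


def immediate_predecessors(d, closure):
    """All input pairs (x, y) taken from the closure with xy = d."""
    return [(d[:i], d[i:]) for i in range(1, len(d))
            if d[:i] in closure and d[i:] in closure]


closure = concat_closure({"00", "0"}, max_len=4)
print(immediate_predecessors("0000", closure))
# [('0', '000'), ('00', '00'), ('000', '0')]
print(immediate_predecessors("00", closure))
# [('0', '0')]
```

The first result exhibits condition (i) of I.2.10 (two distinct sets of immediate predecessors), and the second exhibits condition (iii), since 00 is also an initial object.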


I.2.12 Example. The pair $(\mathcal{I}, \mathcal{R})$, where $\mathcal{I} = \{3\}$ and $\mathcal{R}$ consists of $z = x + y$
and $z = x - y$, is ambiguous. Even 3 has (infinitely many) distinct sets of i.p.
(e.g., any $\{a, b\}$ such that $a + b = 3$, or $a - b = 3$).
The pairs that effect the definition of Term (I.1.5) and Wff (I.1.8) are
unambiguous (see Exercises I.2 and I.3).


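I.2.13 Metatheorem (Definition by Recursion). Let $(\mathcal{I}, \mathcal{R})$ be unambiguous
and $\mathrm{Cl}(\mathcal{I}, \mathcal{R}) \subseteq A$, where $A$ is some set. Let also $Y$ be a set, and† $h : \mathcal{I} \to Y$
and $g_Q$, for each $Q \in \mathcal{R}$, be given functions. For any $(r + 1)$-ary $Q$, an input for
the function $g_Q$ is a sequence $(a, b_1, \ldots, b_r)$ where $a$ is in $A$ and the $b_1, \ldots, b_r$
are all in $Y$. All the $g_Q$ yield outputs in $Y$.

Under these assumptions, there is a unique function $f : \mathrm{Cl}(\mathcal{I}, \mathcal{R}) \to Y$
such that

$$
y = f(x) \quad \text{iff} \quad
\begin{cases}
y = h(x) \text{ and } x \in \mathcal{I} \\
\text{or, for some } Q \in \mathcal{R}, \\
y = g_Q(x, o_1, \ldots, o_r) \text{ and } Q(a_1, \ldots, a_r, x) \text{ holds,} \\
\text{where } o_i = f(a_i), \text{ for } i = 1, \ldots, r
\end{cases}
\tag{1}
$$

The reader may wish to skip the proof on first reading.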


Proof. Existence part. For each $(r + 1)$-ary $Q \in \mathcal{R}$, define $\hat{Q}$ by‡

$$\hat{Q}\big((a_1, o_1), \ldots, (a_r, o_r), (b, g_Q(b, o_1, \ldots, o_r))\big) \quad \text{iff} \quad Q(a_1, \ldots, a_r, b) \tag{2}$$

For any $a_1, \ldots, a_r, b$, the above definition of $\hat{Q}$ is effected for all possible
choices of $o_1, \ldots, o_r$ such that $g_Q(b, o_1, \ldots, o_r)$ is defined.

Collect now all the $\hat{Q}$ to form a set of rules $\hat{\mathcal{R}}$.
Let also $\hat{\mathcal{I}} = \{(x, h(x)) : x \in \mathcal{I}\}$.


† The notation $f : A \to B$ is common in informal (and formal) mathematics. It denotes a function
$f$ that receives “inputs” from the set $A$ and yields “outputs” in the set $B$.
‡ For a relation $Q$, writing just “$Q(a_1, \ldots, a_r, b)$” is equivalent to writing “$Q(a_1, \ldots, a_r, b)$ holds”.

We will verify that the set $F = \mathrm{Cl}(\hat{\mathcal{I}}, \hat{\mathcal{R}})$ is a 2-ary relation that for every
input yields at most one output, and therefore is a function. For such a relation
it is customary to write, letting the context fend off the obvious ambiguity in
the use of the letter $F$,

$$y = F(x) \quad \text{iff} \quad F(x, y) \tag{$*$}$$

We will further verify that replacing $f$ in (1) above by $F$ results in a valid
equivalence (the “iff” holds). That is, $F$ satisfies (1).


(a) We establish that $F$ is a relation composed of pairs $(x, y)$ ($x$ is input, $y$ is
output), where $x \in \mathrm{Cl}(\mathcal{I}, \mathcal{R})$ and $y \in Y$. This follows easily by induction
on $F$ (I.2.3), since $\hat{\mathcal{I}} \subseteq F$, and the property (of “containing such pairs”)
propagates with each $\hat{Q}$ (recall that the $g_Q$ yield outputs in $Y$).

(b) We next show that “if $(x, y) \in F$ and $(x, z) \in F$, then $y = z$”, that is, $F$ is
“single-valued” or “well-defined”, in short, it is a function.

We again employ induction on $F$, thinking of the quoted statement as a
“property” of the pair $(x, y)$:

Suppose that $(x, y) \in \hat{\mathcal{I}}$, and let also $(x, z) \in F$.

By I.2.6, $(x, z) \in \hat{\mathcal{I}}$, or $\hat{Q}((a_1, o_1), \ldots, (a_r, o_r), (x, z))$, where
$Q(a_1, \ldots, a_r, x)$ and $z = g_Q(x, o_1, \ldots, o_r)$, for some $(r + 1)$-ary $Q$ and
$(a_1, o_1), \ldots, (a_r, o_r)$ in $F$.

The right hand side of the italicized “or” cannot hold for an unambiguous
$(\mathcal{I}, \mathcal{R})$, since $x$ cannot have i.p. Thus $(x, z) \in \hat{\mathcal{I}}$; hence $y = h(x) = z$.

To prove that the property propagates with each $\hat{Q}$, let

$$\hat{Q}\big((a_1, o_1), \ldots, (a_r, o_r), (x, y)\big)$$

but also

$$\hat{P}\big((b_1, o'_1), \ldots, (b_l, o'_l), (x, z)\big)$$

where $Q(a_1, \ldots, a_r, x)$, $P(b_1, \ldots, b_l, x)$, and

$$y = g_Q(x, o_1, \ldots, o_r) \quad \text{and} \quad z = g_P(x, o'_1, \ldots, o'_l) \tag{3}$$

Since $(\mathcal{I}, \mathcal{R})$ is unambiguous, we have $Q = P$ (hence also $\hat{Q} = \hat{P}$), $r = l$,
and $a_i = b_i$ for $i = 1, \ldots, r$.
By I.H., $o_i = o'_i$ for $i = 1, \ldots, r$; hence $y = z$ by (3).

(c) Finally, we show that $F$ satisfies (1). We do induction on $\mathrm{Cl}(\hat{\mathcal{I}}, \hat{\mathcal{R}})$ to prove:

($\leftarrow$) If $x \in \mathcal{I}$ and $y = h(x)$, then $F(x, y)$ (i.e., $y = F(x)$ in the
alternative notation ($*$)), since $\hat{\mathcal{I}} \subseteq F$. Let next $y = g_Q(x, o_1, \ldots, o_r)$
and $Q(a_1, \ldots, a_r, x)$, where also $F(a_i, o_i)$, for $i = 1, \ldots, r$. By (2),
$\hat{Q}((a_1, o_1), \ldots, (a_r, o_r), (x, g_Q(x, o_1, \ldots, o_r)))$; thus, $F$ being closed
under all the rules in $\hat{\mathcal{R}}$, $F(x, g_Q(x, o_1, \ldots, o_r))$ holds; in short, $F(x, y)$
or $y = F(x)$.

($\rightarrow$) Now we assume that $F(x, y)$ holds and we want to infer the right
hand side (of iff) in (1). We employ Metatheorem I.2.6.

Case 1. Let $(x, y)$ be $F$-derivable† in $n = 1$ step. Then $(x, y) \in \hat{\mathcal{I}}$. Thus
$y = h(x)$.

Case 2. Suppose next that $(x, y)$ is $F$-derivable within $n + 1$ steps,
namely, we have a derivation

$$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n), (x, y) \tag{4}$$

where $\hat{Q}((a_1, o_1), \ldots, (a_r, o_r), (x, y))$ and $Q(a_1, \ldots, a_r, x)$ (see (2)),
and each of $(a_1, o_1), \ldots, (a_r, o_r)$ appears in the above derivation, to
the left of $(x, y)$. This entails (by (2)) that $y = g_Q(x, o_1, \ldots, o_r)$. Since
the $(a_i, o_i)$ appear in (4), $F(a_i, o_i)$ holds, for $i = 1, \ldots, r$. Thus, $(x, y)$
satisfies the right hand side of iff in (1), once more.


Uniqueness part. Let the function $K$ also satisfy (1). We show, by induction
on $\mathrm{Cl}(\mathcal{I}, \mathcal{R})$, that

$$\text{For all } x \in \mathrm{Cl}(\mathcal{I}, \mathcal{R}) \text{ and all } y \in Y, \quad y = F(x) \text{ iff } y = K(x) \tag{5}$$

($\rightarrow$) Let $x \in \mathcal{I}$, and $y = F(x)$. By lack of ambiguity, the case conditions
of (1) are mutually exclusive. Thus, it must be that $y = h(x)$. But then, $y = K(x)$
as well, since $K$ satisfies (1) too.

Let now $Q(a_1, \ldots, a_r, x)$ and $y = F(x)$. By (1), there are (unique, as we now
know) $o_1, \ldots, o_r$ such that $o_i = F(a_i)$ for $i = 1, \ldots, r$, and
$y = g_Q(x, o_1, \ldots, o_r)$. By the I.H., $o_i = K(a_i)$. But then (1) yields $y = K(x)$ as well
(since $K$ satisfies (1)).

($\leftarrow$) Just interchange the letters $F$ and $K$ in the above argument.


The above clearly is valid for functions $h$ and $g_Q$ that may fail to be defined
everywhere in their “natural” input sets. To be able to have this degree of
generality without having to state additional definitions (such as left fields,
right fields, partial functions, total functions, nontotal functions, Kleene “weak
equality”), we have stated the recurrence (1) the way we did (to keep an eye on
both the input and output side of things) rather than the “usual”

$$
f(x) =
\begin{cases}
h(x) & \text{if } x \in \mathcal{I} \\
g_Q(x, f(a_1), \ldots, f(a_r)) & \text{if } Q(a_1, \ldots, a_r, x) \text{ holds}
\end{cases}
$$

Of course, if all the $g_Q$ and $h$ are defined everywhere on their input sets (i.e.,
they are “total”), then $f$ is defined everywhere on $\mathrm{Cl}(\mathcal{I}, \mathcal{R})$ (see Exercise I.4).


† $(\hat{\mathcal{I}}, \hat{\mathcal{R}})$-derivable.


I.3. Axioms and Rules of Inference 


Now that we have our language, L, we will embark on using it to formally 
effect deductions. These deductions start at the axioms. Deductions employ 
“acceptable” purely syntactic —i.e., based on form, not on meaning — rules that 
allow us to write a formula down (to deduce it) solely because certain other 
formulas that are syntactically related to it were already deduced (i.e., already 
written down). These string-manipulation rules are called rules of inference. 
We describe in this section the axioms and the rules of inference that we will 
accept into our logical calculus and that are common to all theories. 

We start with a precise definition of tautologies in our first order language L. 


I.3.1 Definition (Prime Formulas in Wff. Propositional Variables). A formula
$\mathcal{A} \in \mathbf{Wff}$ is a prime formula or a propositional variable iff it is either of

Pri1. atomic,
Pri2. a formula of the form $((\exists x)\mathcal{A})$.

We use the lowercase letters $p, q, r$ (with or without subscripts or primes) to
denote arbitrary prime formulas (propositional variables) of our language.


That is, a prime formula has either no propositional connectives, or if it does,
it hides them inside the scope of $(\exists x)$.

We may think of a propositional variable as a “blob” that a myopic being
makes out of a formula described in I.3.1. The same being will see an arbitrary
well-formed formula as a bunch of blobs, brackets, and Boolean connectives
($\neg$, $\vee$), “correctly connected” as stipulated below.†


I.3.2 Definition (Propositional Formulas). The set of propositional formulas
over $\mathcal{V}$, denoted here by Prop, is the smallest set such that:

(1) Every propositional variable (over $\mathcal{V}$) is in Prop.
(2) If $\mathcal{A}$ and $\mathcal{B}$ are in Prop, then so are $(\neg \mathcal{A})$ and $(\mathcal{A} \vee \mathcal{B})$.




† Interestingly, our myope can see the brackets and the Boolean connectives.

I.3.3 Metatheorem. Prop = Wff.

Proof. ($\subseteq$) We do induction on Prop. Every item in I.3.2(1) is in Wff. Wff
satisfies I.3.2(2) (see I.1.8(b)). Done.

($\supseteq$) We do induction on Wff. Every item in I.1.8(a) is a propositional variable
(over $\mathcal{V}$), and hence is in Prop.

Prop trivially satisfies I.1.8(b). It also satisfies I.1.8(c), for if $\mathcal{A}$ is in Prop,
then it is in Wff by the $\subseteq$-direction, above. Then, by I.3.1, $((\exists x)\mathcal{A})$ is a
propositional variable and hence in Prop. We are done once more.


I.3.4 Definition (Propositional Valuations). We can arbitrarily assign a value
of 0 or 1 to every $\mathcal{A}$ in Wff (or Prop) as follows:

(1) We fix an assignment of 0 or 1 to every prime formula. We can think of this
as an arbitrary but fixed function $v : \{\text{all prime formulas over } L\} \to \{0, 1\}$
in the metatheory.

(2) We define by recursion an extension of $v$, denoted by $\bar{v}$:

$$\bar{v}((\neg \mathcal{A})) = 1 - \bar{v}(\mathcal{A})$$
$$\bar{v}((\mathcal{A} \vee \mathcal{B})) = \bar{v}(\mathcal{A}) \cdot \bar{v}(\mathcal{B})$$

where “$\cdot$” above denotes number multiplication.

We call, traditionally, the values 0 and 1 by the names “true” and “false”
respectively, and write $\mathbf{t}$ and $\mathbf{f}$ respectively.

We also call a valuation $v$ a truth (value) assignment.

We use the jargon “$\mathcal{A}$ takes the truth value $\mathbf{t}$ (respectively, $\mathbf{f}$) under a valuation
$v$” to mean “$\bar{v}(\mathcal{A}) = 0$ (respectively, $\bar{v}(\mathcal{A}) = 1$)”.

The above inductive definition of $\bar{v}$ relies on the fact that Definition I.3.2 of
Prop is unambiguous (I.2.10, p. 24), or that a propositional formula is uniquely
readable (or parsable) (see Exercises I.6 and I.7). It employs the metatheorem
on recursive definitions (I.2.13).
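
The recursion in I.3.4(2) translates directly into a recursive function. The following is a minimal sketch under assumptions of its own: a prime formula is represented by a Python string, $(\neg\mathcal{A})$ by the tuple `("not", A)`, and $(\mathcal{A} \vee \mathcal{B})$ by `("or", A, B)`. The tuple representation plays the role of the brackets, so each formula is uniquely readable. The name `extend` is hypothetical.

```python
T, F = 0, 1   # the text's convention: 0 is "true" (t), 1 is "false" (f)


def extend(v, formula):
    """Value of `formula` under the extension of the valuation `v`."""
    if isinstance(formula, str):              # prime formula
        return v[formula]
    if formula[0] == "not":
        return 1 - extend(v, formula[1])
    if formula[0] == "or":
        return extend(v, formula[1]) * extend(v, formula[2])
    raise ValueError(f"not a propositional formula: {formula!r}")


v = {"p": T, "q": F}
print(extend(v, ("or", "p", ("not", "p"))))   # 0, i.e. t
print(extend(v, ("or", ("not", "p"), "q")))   # 1, i.e. f
```

Note that with 0 standing for true, the product is 0 exactly when at least one factor is 0, which is why multiplication computes disjunction here.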

The reader may think that all this about unique readability is just an annoying 
quibble. Actually it can be a matter of life and death. The ancient Oracle of 
Delphi had the nasty habit of issuing ambiguous — not uniquely readable, that 
is — pronouncements. One famous such pronouncement, rendered in English, 
went like this: “You will go you will return not dying in the war”.† Given that
ancient Greeks did not use punctuation, the above has two diametrically opposite
meanings depending on whether you put a comma before or after “not”.


† The original was “Ήξεις αφήξεις ου θνήξεις εν πολέμω”.

The situation with formulas in Prop would have been as disastrous in the
absence of brackets (which serve as punctuation), because unique readability
would not be guaranteed: For example, for three distinct prime formulas $p, q, r$
we could find a $v$ such that $\bar{v}(p \to q \to r)$ is different depending on whether
we meant to insert brackets around “$p \to q$” or around “$q \to r$” (can you find
such a $v$?).


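I.3.5 Remark (Truth Tables). Definition I.3.4 is often given in terms of
truth-functions. For example, we could have defined (in the metatheory, of course)
the function $F_\neg : \{\mathbf{t}, \mathbf{f}\} \to \{\mathbf{t}, \mathbf{f}\}$ by

$$
F_\neg(x) =
\begin{cases}
\mathbf{t} & \text{if } x = \mathbf{f} \\
\mathbf{f} & \text{if } x = \mathbf{t}
\end{cases}
$$

We could then say that $\bar{v}((\neg \mathcal{A})) = F_\neg(\bar{v}(\mathcal{A}))$. One can similarly take care of
all the connectives ($\vee$ and all the abbreviations) with the help of truth functions
$F_\vee$, $F_\wedge$, $F_\to$, $F_\leftrightarrow$. These functions are conveniently given via so-called
truth-tables as indicated below:

$$
\begin{array}{cc|ccccc}
x & y & F_\neg(x) & F_\vee(x, y) & F_\wedge(x, y) & F_\to(x, y) & F_\leftrightarrow(x, y) \\ \hline
\mathbf{f} & \mathbf{f} & \mathbf{t} & \mathbf{f} & \mathbf{f} & \mathbf{t} & \mathbf{t} \\
\mathbf{f} & \mathbf{t} & \mathbf{t} & \mathbf{t} & \mathbf{f} & \mathbf{t} & \mathbf{f} \\
\mathbf{t} & \mathbf{f} & \mathbf{f} & \mathbf{t} & \mathbf{f} & \mathbf{f} & \mathbf{f} \\
\mathbf{t} & \mathbf{t} & \mathbf{f} & \mathbf{t} & \mathbf{t} & \mathbf{t} & \mathbf{t}
\end{array}
$$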


I.3.6 Definition (Tautologies, Satisfiable Formulas, Unsatisfiable Formulas
in Wff). A formula $\mathcal{A} \in \mathbf{Wff}$ (equivalently, in Prop) is a tautology iff for all
valuations $v$ one has $\bar{v}(\mathcal{A}) = \mathbf{t}$.

We call the set of all tautologies, as defined here, Taut. The symbol $\models_{\mathrm{taut}} \mathcal{A}$
says “$\mathcal{A}$ is in Taut”.

A formula $\mathcal{A} \in \mathbf{Wff}$ (equivalently, in Prop) is satisfiable iff for some
valuation $v$ one has $\bar{v}(\mathcal{A}) = \mathbf{t}$. We say that $v$ satisfies $\mathcal{A}$.

A set of formulas $\Gamma$ is satisfiable iff for some valuation $v$, one has $\bar{v}(\mathcal{A}) = \mathbf{t}$
for every $\mathcal{A}$ in $\Gamma$. We say that $v$ satisfies $\Gamma$.

A formula $\mathcal{A} \in \mathbf{Wff}$ (equivalently, in Prop) is unsatisfiable iff for all
valuations $v$ one has $\bar{v}(\mathcal{A}) = \mathbf{f}$. A set of formulas $\Gamma$ is unsatisfiable iff for all
valuations $v$ one has $\bar{v}(\mathcal{A}) = \mathbf{f}$ for some $\mathcal{A}$ in $\Gamma$.


I.3.7 Definition (Tautologically Implies, for Formulas in Wff). Let $\mathcal{A}$ and
$\Gamma$ be respectively any formula and any set of formulas (over $L$).

The symbol $\Gamma \models_{\mathrm{taut}} \mathcal{A}$, pronounced “$\Gamma$ tautologically implies $\mathcal{A}$”, means
that every truth assignment $v$ that satisfies $\Gamma$ also satisfies $\mathcal{A}$.


“Satisfiable” and “unsatisfiable” are terms introduced here in the propositional
or Boolean sense. These terms have a more complicated meaning when we
decide to “see” the object variables and quantifiers that occur in formulas.


We have at once

I.3.8 Lemma.† $\Gamma \models_{\mathrm{taut}} \mathcal{A}$ iff $\Gamma \cup \{\neg \mathcal{A}\}$ is unsatisfiable (in the propositional
sense).


If $\Gamma = \emptyset$ then $\Gamma \models_{\mathrm{taut}} \mathcal{A}$ says just $\models_{\mathrm{taut}} \mathcal{A}$, since the hypothesis “every truth
assignment $v$ that satisfies $\Gamma$”, in the definition above, is vacuously satisfied.
For that reason we almost never write $\emptyset \models_{\mathrm{taut}} \mathcal{A}$ and write instead $\models_{\mathrm{taut}} \mathcal{A}$.


I.3.9 Exercise. For any formula $\mathcal{A}$ and any two valuations $v$ and $v'$, $\bar{v}(\mathcal{A}) =
\bar{v}'(\mathcal{A})$ if $v$ and $v'$ agree on all the propositional variables that occur in $\mathcal{A}$.

In the same manner, $\Gamma \models_{\mathrm{taut}} \mathcal{A}$ is oblivious to $v$-variations that do not affect
the variables that occur in $\Gamma$ and $\mathcal{A}$ (see Exercise I.8).
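
For a finite set of formulas, tautological implication and unsatisfiability can be decided by brute force, because (by Exercise I.3.9) only the values of the propositional variables that actually occur matter. The sketch below is an illustration only and rests on assumptions of its own: formulas are strings (prime formulas) or tuples `("not", A)` and `("or", A, B)`, and `implies` encodes the usual abbreviation of $\mathcal{A} \to \mathcal{B}$ as $((\neg\mathcal{A}) \vee \mathcal{B})$. All function names are hypothetical.

```python
from itertools import product

T, F = 0, 1   # the text's convention: 0 stands for t, 1 for f


def value(v, a):
    """Extension of the valuation v, as in I.3.4."""
    if isinstance(a, str):
        return v[a]
    if a[0] == "not":
        return 1 - value(v, a[1])
    return value(v, a[1]) * value(v, a[2])       # a[0] == "or"


def variables(a):
    """Set of propositional variables occurring in the formula a."""
    if isinstance(a, str):
        return {a}
    return set().union(*(variables(b) for b in a[1:]))


def valuations(formulas):
    """All valuations of the variables occurring in `formulas`."""
    names = sorted(set().union(*(variables(a) for a in formulas)))
    for bits in product((T, F), repeat=len(names)):
        yield dict(zip(names, bits))


def satisfies(v, gamma):
    return all(value(v, a) == T for a in gamma)


def taut_implies(gamma, a):
    """Every valuation that satisfies gamma also satisfies a."""
    return all(value(v, a) == T
               for v in valuations(gamma + [a]) if satisfies(v, gamma))


def is_unsatisfiable(gamma):
    return not any(satisfies(v, gamma) for v in valuations(gamma))


def implies(a, b):
    return ("or", ("not", a), b)


gamma = ["p", implies("p", "q")]
print(taut_implies(gamma, "q"), is_unsatisfiable(gamma + [("not", "q")]))
# True True
print(taut_implies(["p"], "q"), is_unsatisfiable(["p", ("not", "q")]))
# False False
print(taut_implies([], ("or", "p", ("not", "p"))))
# True
```

In both test cases the two answers agree, as Lemma I.3.8 predicts. The last line checks a tautology by taking the set of hypotheses to be empty.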


Before presenting the axioms, we need to introduce the concept of substitu- 
tion. 


I.3.10 Tentative Definition (Substitutions of Terms). Let $\mathcal{A}$ be a formula, $x$
an (object) variable, and $t$ a term. $\mathcal{A}[x \leftarrow t]$ denotes the result of “replacing”
all free occurrences of $x$ in $\mathcal{A}$ by the term $t$, provided no variable of $t$ was
“captured” (by a quantifier) during substitution.

If the proviso is valid, then we say that “$t$ is substitutable for $x$ (in $\mathcal{A}$)”, or
that “$t$ is free for $x$ (in $\mathcal{A}$)”. If the proviso is not valid, then the substitution is
undefined.


† The word “lemma” has Greek origin, “λήμμα”, plural “lemmata” (some people say “lemmas”)
from “λήμματα”. It derives from the verb “λαμβάνω” (to take) and thus means “taken thing”.
In mathematical reasoning a lemma is a provable auxiliary statement that is taken and used as
a stepping stone in lengthy mathematical arguments (invoked therein by name, as in “... by
Lemma such and such ...”), much as “subroutines” (or “procedures”) are taken and used as
auxiliary stepping stones to elucidate lengthy computer programs. Thus our purpose in having
lemmata is to shorten proofs by breaking them up into modules.


I.3.11 Remark. There are a number of issues about Definition I.3.10 that need
discussion or clarification.

Reasonable people will be satisfied with the above definition “as is”. However,
there are some obscure points (enclosed in quotation marks above).

(1) What is this about “capture”? Well, suppose that $\mathcal{A} \equiv (\exists x)\neg x = y$. Let
$t \equiv x$.† Then $\mathcal{A}[y \leftarrow t] \equiv (\exists x)\neg x = x$, which says something altogether
different than the original.‡ Intuitively, this is unexpected (and undesirable):
$\mathcal{A}$ codes a statement about the free variable $y$, i.e., a statement about all
objects which could be “values” (or meanings) of $y$. One would have
expected that, in particular, $\mathcal{A}[y \leftarrow x]$ (if the substitution were allowed)
would make this very same statement about the values of $x$. It does not.
What happened is that $x$ was captured by the quantifier upon substitution,
thus distorting $\mathcal{A}$'s original meaning.

(2) Are we sure that the term “replace” is mathematically precise?

(3) Is $\mathcal{A}[x \leftarrow t]$ always a formula, if $\mathcal{A}$ is?

A re-visitation of I.3.10 via an inductive definition (by induction on terms
and formulas) settles (1)–(3) at once (in particular, the informal terms “replace”
and “capture” do not appear in the inductive definition). We define (again) the
symbol $\mathcal{A}[x \leftarrow t]$, for any formula $\mathcal{A}$, variable $x$, and term $t$, this time by
induction on terms and formulas:

First off, let us define $s[x \leftarrow t]$, where $s$ is also a term, by cases:

$$
s[x \leftarrow t] \equiv
\begin{cases}
t & \text{if } s \equiv x \\
a & \text{if } s \equiv a, \text{ a constant (symbol)} \\
y & \text{if } s \equiv y, \text{ a variable } \not\equiv x \\
f\, r_1[x \leftarrow t]\, r_2[x \leftarrow t] \ldots r_n[x \leftarrow t] & \text{if } s \equiv f r_1 \ldots r_n
\end{cases}
$$

Pause. Is $s[x \leftarrow t]$ always a term? That this is so follows directly by induction
on terms, using the definition by cases above and the I.H. that each of $r_i[x \leftarrow t]$,
$i = 1, \ldots, n$, is a term.


† Recall that in I.1.4 (p. 13) we defined the symbol “$\equiv$” to be equality on strings.
‡ The original says that for any object $y$ there is an object that is different from it; $\mathcal{A}[y \leftarrow x]$ says
that there is an object that is different from itself.

We turn now to formulas. The symbols $P$, $r$, $s$ (with or without subscripts)
below denote a predicate of arity $n$, a term, and a term (respectively):

$$
\mathcal{A}[x \leftarrow t] \equiv
\begin{cases}
s[x \leftarrow t] = r[x \leftarrow t] & \text{if } \mathcal{A} \equiv s = r \\
P\, r_1[x \leftarrow t]\, r_2[x \leftarrow t] \ldots r_n[x \leftarrow t] & \text{if } \mathcal{A} \equiv P r_1 \ldots r_n \\
(\mathcal{B}[x \leftarrow t] \vee \mathcal{C}[x \leftarrow t]) & \text{if } \mathcal{A} \equiv (\mathcal{B} \vee \mathcal{C}) \\
(\neg(\mathcal{B}[x \leftarrow t])) & \text{if } \mathcal{A} \equiv (\neg \mathcal{B}) \\
\mathcal{A} & \text{if } \mathcal{A} \equiv ((\exists y)\mathcal{B}) \text{ and } y \equiv x \\
((\exists y)(\mathcal{B}[x \leftarrow t])) & \text{if } \mathcal{A} \equiv ((\exists y)\mathcal{B}) \text{ and } y \not\equiv x \text{ and } y \text{ does not occur in } t
\end{cases}
$$

In all cases above, the left hand side is defined iff the right hand side is.

Pause. We have eliminated “replaces” and “captured”. But is $\mathcal{A}[x \leftarrow t]$ a
formula (whenever it is defined)? (See Exercise I.9)
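
The two definitions by cases above can be transcribed almost literally into code. The following is a minimal sketch, not the text's own formalism: terms and formulas are represented by tagged tuples (an assumption of this example), and an undefined substitution is signalled by returning `None`. The tags and function names are hypothetical.

```python
# Terms:    ("var", name) | ("const", name) | ("fn", f, [terms])
# Formulas: ("eq", s, r) | ("pred", P, [terms]) | ("or", B, C)
#           | ("not", B) | ("exists", y, B)


def occurs_in_term(y, term):
    """Does the variable named y occur in the term?"""
    if term[0] == "var":
        return term[1] == y
    if term[0] == "const":
        return False
    return any(occurs_in_term(y, r) for r in term[2])


def subst_term(s, x, t):
    """s[x <- t] for a term s; always defined."""
    if s[0] == "var":
        return t if s[1] == x else s
    if s[0] == "const":
        return s
    return ("fn", s[1], [subst_term(r, x, t) for r in s[2]])


def subst(a, x, t):
    """A[x <- t] for a formula A, or None when it is undefined."""
    kind = a[0]
    if kind == "eq":
        return ("eq", subst_term(a[1], x, t), subst_term(a[2], x, t))
    if kind == "pred":
        return ("pred", a[1], [subst_term(r, x, t) for r in a[2]])
    if kind == "not":
        b = subst(a[1], x, t)
        return None if b is None else ("not", b)
    if kind == "or":
        b, c = subst(a[1], x, t), subst(a[2], x, t)
        return None if b is None or c is None else ("or", b, c)
    y, body = a[1], a[2]                 # kind == "exists"
    if y == x:
        return a                         # x is not free here: unchanged
    if occurs_in_term(y, t):
        return None                      # no case applies: undefined
    b = subst(body, x, t)
    return None if b is None else ("exists", y, b)


x, y, z = ("var", "x"), ("var", "y"), ("var", "z")
A = ("exists", "x", ("not", ("eq", x, y)))    # (Ex) not x = y
print(subst(A, "y", x))    # None: x would be captured
print(subst(A, "y", z))    # ('exists', 'x', ('not', ('eq', ('var', 'x'), ('var', 'z'))))
```

The first call reproduces the capture example of I.3.11(1): the substitution is simply undefined. The words “replace” and “capture” appear nowhere in the code, only the cases of the definition.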


I.3.12 Definition (Simultaneous Substitution). The symbol

$$\mathcal{A}[y_1, \ldots, y_r \leftarrow t_1, \ldots, t_r]$$

or, equivalently, $\mathcal{A}[\vec{y}_r \leftarrow \vec{t}_r]$ (where $\vec{y}_r$ is an abbreviation of $y_1, \ldots, y_r$)
denotes simultaneous substitution of the terms $t_1, \ldots, t_r$ into the variables
$y_1, \ldots, y_r$ in the following sense: Let $\vec{z}_r$ be variables that do not occur at all
(either as free or bound) in any of $\mathcal{A}$, $\vec{t}_r$. Then $\mathcal{A}[\vec{y}_r \leftarrow \vec{t}_r]$ is short for

$$\mathcal{A}[y_1 \leftarrow z_1] \ldots [y_r \leftarrow z_r][z_1 \leftarrow t_1] \ldots [z_r \leftarrow t_r] \tag{1}$$

Exercise I.10 shows that we obtain the same string in (1) above, regardless of
our choice of new variables $\vec{z}_r$.


More Conventions. The symbol $[x \leftarrow t]$ lies in the metalanguage. This
metasymbol has the highest priority, so that, e.g., $\mathcal{A} \vee \mathcal{B}[x \leftarrow t]$ means
$\mathcal{A} \vee (\mathcal{B}[x \leftarrow t])$, $(\exists x)\mathcal{B}[x \leftarrow t]$ means $(\exists x)(\mathcal{B}[x \leftarrow t])$, etc.

The reader is reminded about the conventions regarding the metanotations
$\mathcal{A}[\vec{x}_r]$ and $\mathcal{A}(\vec{x}_r)$ (see I.1.11). In the context of those notations, if $t_1, \ldots, t_r$ are
terms, the symbol $\mathcal{A}[t_1, \ldots, t_r]$ abbreviates $\mathcal{A}[\vec{y}_r \leftarrow \vec{t}_r]$.


We are ready to introduce the (logical) axioms and rules of inference.

Schemata.† Some of the axioms below will actually be schemata. A formula
schema, or formula form, is a string $\mathcal{G}$ of the metalanguage that contains
syntactic variables, such as $\mathcal{A}$, $P$, $f$, $a$, $t$, $x$.

Whenever we replace all these syntactic variables that occur in $\mathcal{G}$ by specific
formulas, predicates, functions, constants, terms, or variables respectively, we
obtain a specific well-formed formula, a so-called instance of the schema. For
example, an instance of $(\exists x)x = a$ is $(\exists v_{12})v_{12} = 0$ (in the language of Peano
arithmetic). An instance of $\mathcal{A} \to \mathcal{A}$ is $v_{101} = v_{114} \to v_{101} = v_{114}$.


I.3.13 Definition (Axioms and Axiom Schemata). The logical axioms are all
the formulas in the group Ax1 and all the possible instances of the schemata in
the remaining groups:

Ax1. All formulas in Taut.

Ax2. (Schema)

$$\mathcal{A}[x \leftarrow t] \to (\exists x)\mathcal{A} \quad \text{for any term } t$$

By I.3.10–I.3.11, the notation already imposes a condition on $t$, that it is
substitutable for $x$.

N.B. We often see the above written as

$$\mathcal{A}[t] \to (\exists x)\mathcal{A}[x]$$

or even

$$\mathcal{A}[t] \to (\exists x)\mathcal{A}$$

Ax3. (Schema) For each object variable $x$, the formula $x = x$.

Ax4. (Leibniz's characterization of equality, first order version. Schema) For
any formula $\mathcal{A}$, object variable $x$, and terms $t$ and $s$, the formula

$$t = s \to (\mathcal{A}[x \leftarrow t] \leftrightarrow \mathcal{A}[x \leftarrow s])$$

N.B. The above is written usually as

$$t = s \to (\mathcal{A}[t] \leftrightarrow \mathcal{A}[s])$$

We must remember that the notation already requires that $t$ and $s$ be free
for $x$.

We will denote the above set of logical axioms by $\Lambda$.


† Plural of schema. This is of Greek origin, σχήμα, meaning (e.g., in geometry) figure or
configuration or even formation.

The logical axioms for equality are not the strongest possible, but they are
adequate for the job. What Leibniz really proposed was the schema $t = s \leftrightarrow
(\forall P)(P[t] \leftrightarrow P[s])$, which says, intuitively, that “two objects $t$ and $s$ are equal
iff, for every ‘property $P$’, both have $P$ or neither has $P$”.

Unfortunately, our system of notation (first-order language) does not allow
quantification over predicate symbols (which can have as “values” arbitrary
“properties”). But is not Ax4 read “for all formulas $\mathcal{A}$” anyway? Yes, but with
one qualification: “For all formulas $\mathcal{A}$ that we can write down in our system of
notation”, and, alas, we cannot write all possible formulas of real mathematics
down, because they are too many.†

While the symbol “=” is suggestive of equality, it is not its shape that
qualifies it as equality. It is the two axioms, Ax3 and Ax4, that make the symbol
behave as we expect equality to behave, and any other symbol of any other
shape (e.g., Enderton (1972) uses “≈”) satisfying these two axioms qualifies as
formal equality that is intended to codify the metamathematical standard “=”.


I.3.14 Remark. In Ax2 and Ax4 we imposed the condition that $t$ (and $s$) must
be substitutable in $x$. Here is why:

Take $\mathcal{A}$ to stand for $(\forall y)x = y$ and $\mathcal{B}$ to stand for $(\exists y)\neg x = y$. Then,
temporarily suspending the restriction on substitutability, $\mathcal{A}[x \leftarrow y] \to (\exists x)\mathcal{A}$ is

$$(\forall y)y = y \to (\exists x)(\forall y)x = y$$

and $x = y \to (\mathcal{B} \leftrightarrow \mathcal{B}[x \leftarrow y])$ is

$$x = y \to ((\exists y)\neg x = y \leftrightarrow (\exists y)\neg y = y)$$

neither of which, obviously, is “valid”.‡

There is a remedy in the metamathematics: Move the quantified variable(s)
out of harm's way, by renaming them so that no quantified variable in $\mathcal{A}$ has
the same name as any (free, of course) variable in $t$ (or $s$).

This renaming is formally correct (i.e., it does not change the meaning of
the formula), as we will see in the variant (meta)theorem (I.4.13). Of course,
it is always possible to effect this renaming, since we have countably many
variables, and only finitely many appear free in $t$ (and $s$) and $\mathcal{A}$. This trivial
remedy allows us to render the conditions in Ax2 and Ax4 harmless. Essentially,
a $t$ (or $s$) is always substitutable after renaming.


† “Uncountably many”, in a precise technical sense developed in the chapter on cardinality in
volume 2 (see p. 62 of this volume for a brief informal “course” in cardinality). This is due to
Cantor's theorem, which implies that there are uncountably many subsets of $\mathbb{N}$. Each such subset
$A$ gives rise to the formula $x \in A$ in the metalanguage.

On the other hand, set theory's formal system of notation, using just $\in$ and $U$ as start-up
(nonlogical) symbols, is only rich enough to write down a countably infinite set of formulas
(cf. p. 62). Thus, our notation will fail to denote uncountably many “real formulas” $x \in A$.
‡ Speaking intuitively is enough for now. Validity will be defined carefully pretty soon.


It is customary to assume a Platonist metatheory, and we do so. We can then
say “countably many” variables without raising any eyebrows. Alternatively,
we know how to get a new variable that is different from all those in a given
finite set of variables without invoking an infinite supply.


I.3.15 Definition (Rules of Inference). The following are the two rules of
inference. These rules are relations in the sense of Section I.2, with inputs from
the set Wff and outputs also in Wff. They are written traditionally as “fractions”.
We call the “numerator” the premise(s) and the “denominator” the conclusion.

We say that a rule of inference is applied to the formula(s) in the numerator,
and that it yields (or results in) the formula in the denominator.

Inf1. Modus ponens, or MP. For any formulas $\mathcal{A}$ and $\mathcal{B}$,

$$\frac{\mathcal{A},\ \mathcal{A} \to \mathcal{B}}{\mathcal{B}}$$

Inf2. $\exists$-introduction (pronounced E-introduction). For any formulas $\mathcal{A}$ and $\mathcal{B}$
such that $x$ is not free in $\mathcal{B}$,

$$\frac{\mathcal{A} \to \mathcal{B}}{(\exists x)\mathcal{A} \to \mathcal{B}}$$

N.B. Recall the conventions on eliminating brackets!


It is immediately clear that the definition above meets our requirement that the
rules of inference be “algorithmic”, in the sense that whether they are applicable
or how they are applicable can be decided and carried out in a finite number
of steps by just looking at the form of (potential input) formulas (not at the
“meaning” of such formulas).


We next define $\Gamma$-theorems, that is, formulas we can prove from the set of
formulas $\Gamma$ (this $\Gamma$ may be empty).


I.3.16 Definition ($\Gamma$-Theorems). The set of $\Gamma$-theorems, $\mathrm{Thm}_\Gamma$, is the least
inclusive subset of Wff that satisfies:

Th1. $\Lambda \subseteq \mathrm{Thm}_\Gamma$ (cf. I.3.13).
Th2. $\Gamma \subseteq \mathrm{Thm}_\Gamma$. We call every member of $\Gamma$ a nonlogical axiom.
Th3. $\mathrm{Thm}_\Gamma$ is closed under each rule Inf1–Inf2.

The metalinguistic statement $\mathcal{A} \in \mathrm{Thm}_\Gamma$ is traditionally written as $\Gamma \vdash \mathcal{A}$,
and we say that $\mathcal{A}$ is proved from $\Gamma$ or that it is a $\Gamma$-theorem.

We also say that $\mathcal{A}$ is deduced by $\Gamma$, or that $\Gamma$ deduces $\mathcal{A}$.

If $\Gamma = \emptyset$, then rather than $\emptyset \vdash \mathcal{A}$ we write $\vdash \mathcal{A}$. We often say in this case
that $\mathcal{A}$ is absolutely provable (or provable with no nonlogical axioms).

We often write $\mathcal{A}, \mathcal{B}, \ldots, \mathcal{D} \vdash \mathcal{E}$ for $\{\mathcal{A}, \mathcal{B}, \ldots, \mathcal{D}\} \vdash \mathcal{E}$.


I.3.17 Definition ($\Gamma$-Proofs). We just saw that $\mathrm{Thm}_\Gamma$ is $\mathrm{Cl}(\mathcal{I}, \mathcal{R})$, where $\mathcal{I}$ is
the set of all logical and nonlogical axioms, and $\mathcal{R}$ contains just the two rules
of inference. An $(\mathcal{I}, \mathcal{R})$-derivation is also called a $\Gamma$-proof (or just proof, if
$\Gamma$ is understood).


I.3.18 Remark. (1) It is clear that if each of $\mathcal{A}_1, \ldots, \mathcal{A}_n$ has a $\Gamma$-proof and
$\mathcal{B}$ has an $\{\mathcal{A}_1, \ldots, \mathcal{A}_n\}$-proof, then $\mathcal{B}$ has a $\Gamma$-proof. Indeed, simply
concatenate all of the given $\Gamma$-proofs (in any sequence). Append to the right of that
sequence the given $\{\mathcal{A}_1, \ldots, \mathcal{A}_n\}$-proof (that ends with $\mathcal{B}$). Then the entire
sequence is a $\Gamma$-proof, and ends with $\mathcal{B}$.

We refer to this phenomenon as the transitivity of $\vdash$.

N.B. Transitivity of $\vdash$ allows one to invoke previously proved (by him or
others) theorems in the course of a proof. Thus, practically, a $\Gamma$-proof is a
sequence of formulas in which each formula is an axiom, is a known $\Gamma$-theorem,
or is obtained by applying a rule of inference on previous formulas of the
sequence.

(2) If $\Gamma \subseteq \Delta$ and $\Gamma \vdash \mathcal{A}$, then also $\Delta \vdash \mathcal{A}$, as follows from I.3.16 or I.3.17.
In particular, $\vdash \mathcal{A}$ implies $\Gamma \vdash \mathcal{A}$ for any $\Gamma$.

(3) It is immediate from the definitions that for any formulas $\mathcal{A}$ and $\mathcal{B}$,

$$\mathcal{A},\ \mathcal{A} \to \mathcal{B} \vdash \mathcal{B} \tag{i}$$

and if, moreover, $x$ is not free in $\mathcal{B}$,

$$\mathcal{A} \to \mathcal{B} \vdash (\exists x)\mathcal{A} \to \mathcal{B} \tag{ii}$$

Some texts (e.g., Schütte (1977)) give the rules in the format of (i)–(ii) above.
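
Since both rules depend only on the form of formulas, checking whether a given sequence is a proof is a mechanical task. The following is a rough sketch under several assumptions of its own: formulas are tagged tuples, $\to$ is represented directly by the tag `"imp"` (in the text it is an abbreviation), atomic formulas carry only variable names as arguments, and the axioms are supplied as a finite list standing in for the logical and nonlogical axioms (the actual set of logical axioms is infinite). All names are hypothetical.

```python
# Formulas: ("pred", P, (variable names...)) | ("not", A) | ("or", A, B)
#           | ("imp", A, B) | ("exists", x, A)


def free_vars(a):
    """Set of variables occurring free in the formula a."""
    kind = a[0]
    if kind == "pred":
        return set(a[2])
    if kind == "not":
        return free_vars(a[1])
    if kind in ("or", "imp"):
        return free_vars(a[1]) | free_vars(a[2])
    if kind == "exists":
        return free_vars(a[2]) - {a[1]}
    raise ValueError(f"unknown formula: {a!r}")


def by_modus_ponens(formula, earlier):
    """Inf1: some A and A -> formula both appear earlier."""
    return any(("imp", a, formula) in earlier for a in earlier)


def by_exists_intro(formula, earlier):
    """Inf2: formula is (Ex)A -> B, x not free in B, A -> B appears earlier."""
    if formula[0] != "imp" or formula[1][0] != "exists":
        return False
    x, a, b = formula[1][1], formula[1][2], formula[2]
    return x not in free_vars(b) and ("imp", a, b) in earlier


def is_proof(sequence, axioms):
    """Each formula is an axiom or follows from earlier ones by Inf1/Inf2."""
    for i, formula in enumerate(sequence):
        earlier = sequence[:i]
        if not (formula in axioms
                or by_modus_ponens(formula, earlier)
                or by_exists_intro(formula, earlier)):
            return False
    return True


P_x = ("pred", "P", ("x",))
Q_x = ("pred", "Q", ("x",))
Q_y = ("pred", "Q", ("y",))

good = [("imp", P_x, Q_y), ("imp", ("exists", "x", P_x), Q_y)]
bad = [("imp", P_x, Q_x), ("imp", ("exists", "x", P_x), Q_x)]
print(is_proof(good, axioms=[("imp", P_x, Q_y)]))   # True
print(is_proof(bad, axioms=[("imp", P_x, Q_x)]))    # False: x is free in Q(x)
```

The second sequence is rejected precisely because of the side condition of Inf2, which is checked by inspecting the shape of the formula and nothing else.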




The axioms and rules provide us with a calculus, that is, a means to
“calculate” proofs and theorems. In the interest of making the calculus more
user-friendly (and thus more easily applicable to mathematical theories of interest,
such as Peano arithmetic or set theory) we are going to develop in the next
section a number of derived principles. These principles are largely of the form
$\mathcal{A}_1, \ldots, \mathcal{A}_n \vdash \mathcal{B}$. We call such a (provable in the metatheory) principle a
derived rule of inference, since, by transitivity of $\vdash$, it can be used as a proof-step
in a $\Gamma$-proof. By contrast, the rules Inf1–Inf2 are “basic” or “primary”; they
are given outright.


We can now fix our understanding of the concept of a formal or mathematical 
theory. 

A (first order) formal (mathematical) theory over a language L, or just theory
over L, or just theory, is a tuple (of “ingredients”) $\mathcal{T} = (L, \Lambda, I, \mathbf{T})$, where
L is a first order language, $\Lambda$ is a set of logical axioms, $I$ is a set of rules of
inference, and $\mathbf{T}$ a non-empty subset of Wff that is required to contain $\Lambda$ (i.e.,
$\Lambda \subseteq \mathbf{T}$) and be closed under the rules $I$.

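Equivalently, one may simply require that $\mathbf{T}$ is closed under $\vdash$, that is, for
any $\Gamma \subseteq \mathbf{T}$ and any formula $\mathcal{A}$, if $\Gamma \vdash \mathcal{A}$, then $\mathcal{A} \in \mathbf{T}$. This is, furthermore,
equivalent to requiring that

    $\mathcal{A} \in \mathbf{T}$ iff $\mathbf{T} \vdash \mathcal{A}$    (1)

Indeed, the if direction follows from closure under $\vdash$, while the only if direction
is a consequence of Definition I.3.16.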

$\mathbf{T}$ is the set of the formulas of the theory,† and we often say “a theory $\mathbf{T}$”,
taking everything else for granted.

If $\mathbf{T}$ = Wff, then the theory is called inconsistent or contradictory. Otherwise
it is called consistent.


Throughout our exposition we fix $\Lambda$ and $I$ as in Definitions I.3.13 and I.3.15.


By (1), $\mathbf{T} = \mathrm{Thm}_{\mathbf{T}}$. This observation suggests that we call theories such as
the ones we have just defined axiomatic theories, in that a set $\Gamma$ always exists
so that $\mathbf{T} = \mathrm{Thm}_{\Gamma}$ (if at a loss, we can just take $\Gamma = \mathbf{T}$).


We are mostly interested in theories $\mathcal{T}$ for which there is a “small” set $\Gamma$
(“small” by comparison with $\mathbf{T}$) such that $\mathbf{T} = \mathrm{Thm}_{\Gamma}$. We say that $\mathcal{T}$ is
axiomatized by $\Gamma$. Naturally, we call $\mathbf{T}$ the set of theorems, and $\Gamma$ the set of
nonlogical axioms of $\mathcal{T}$.


If, moreover, $\Gamma$ is recognizable (i.e., we can tell “algorithmically” whether
or not a formula $\mathcal{A}$ is in $\Gamma$), then we say that $\mathcal{T}$ is recursively axiomatized.

Examples of recursively axiomatized theories are ZFC set theory and Peano
arithmetic. On the other hand, if we take $\mathbf{T}$ to be all the formulas of arithmetic
that are true when interpreted “in the intended way”‡ over $\mathbb{N}$ (the so-called
complete arithmetic), then there is no recognizable $\Gamma$ such that $\mathbf{T} = \mathrm{Thm}_{\Gamma}$.
We say that complete arithmetic is not recursively axiomatizable.§

† As opposed to “of the language”, which is all of Wff.
‡ That is, the symbol “0” of the language is interpreted as the $0 \in \mathbb{N}$, “Sx” as x + 1, “$(\exists x)$” as
“there is an $x \in \mathbb{N}$”, etc.


Pause. Why does complete arithmetic form a theory? Because work of Section I.5
(in particular, the soundness theorem) entails that it is closed under $\vdash$.


We tend to further abuse language and call axiomatic theories by the name
of their (set of) nonlogical axioms $\Gamma$. Thus if $\mathcal{T} = (L, \Lambda, I, \mathbf{T})$ is a first order
theory and $\mathbf{T} = \mathrm{Thm}_{\Gamma}$, then we may say interchangeably “theory $\mathcal{T}$”, “theory
$\mathbf{T}$” or “theory $\Gamma$”.

If $\Gamma = \emptyset$, then we have a pure or absolute theory (i.e., we are “just doing
logic, not math”). If $\Gamma \neq \emptyset$, then we have an applied theory.


Argot. A final note on language versus metalanguage, and theory versus 
metatheory. When are we speaking the metalanguage and when are we speaking 
the formal language? 


The answer is, respectively, “almost always” and “almost never”. As it 
has been remarked before, in principle, we are speaking the formal language 
exactly when we are pronouncing or writing down a string from Term or Wff. 
Otherwise we are (speaking or writing) in the metalanguage. It appears that we
(and everybody else who has written a book in logic or set theory) are speaking
and writing within the metalanguage with a frequency approaching 100%.

The formalist is clever enough to simplify notation at all times. We will 
seldom be caught writing down a member of Wff in this book, and, on the rare 
occasions we may do so, it will only be to serve as an illustration of why one 
should avoid writing down such formulas: because they are too long and hard 
to read and understand. 

We will be speaking the formal language with a heavy “accent” and using 
many “idioms” borrowed from “real” (meta)mathematics and English. We will 
call our dialect argot, following Manin (1977). 

The important thing to remember is when we are working in the theory,¶ and
this is precisely when we generate theorems. That is, it does not matter if a theorem
(and much of what we write down during the proof) is written in argot.


Two examples:


(1) One is working in formal number theory (or formal arithmetic) if one states
and proves (say, from the Peano axioms) that “every natural number n > 1
has a prime factor”. Note how this theorem is stated in argot. Below we
give its translation into the formal language of arithmetic:†

    $(\forall n)(S0 < n \to (\exists x)(\exists y)(n = x \times y \land S0 < x \land (\forall m)(\forall r)(x = m \times r \to m = S0 \lor m = x)))$    (1)

(2) One is working in formal logic if one is writing a proof of $(\exists v_{13})\, v_{13} = v_{13}$.

§ The trivial solution (that is, taking $\Gamma = \mathbf{T}$) will not do, for it turns out that $\mathbf{T}$ is not
recognizable.

¶ Important, because arguing in the theory restricts us to use only its axioms (and earlier proved
theorems; cf. I.3.18) and its rules of inference; nothing extraneous to these syntactic tools is
allowed.


Suppose though that our activity consists of effecting definitions, introducing
axioms, or analyzing the behaviour or capability of $\mathcal{T}$, e.g., proving some derived
rule $\mathcal{A}_1, \ldots, \mathcal{A}_n \vdash \mathcal{B}$ (that is, a theorem schema) or investigating consistency‡
or “relative consistency”.§ Then we are operating in the metatheory,
that is, in “real” mathematics.


One of the most important problems posed in the metatheory is
“Given a theory $\mathcal{T}$ and a formula $\mathcal{A}$. Is $\mathcal{A}$ a theorem of $\mathcal{T}$?”


This is Hilbert’s Entscheidungsproblem, or decision problem. Hilbert be- 
lieved that every recursively axiomatized theory ought to admit a “general” 
solution, by more or less mechanical means, to its decision problem. The techniques
of Gödel and the insight of Church showed that this problem is, in
general, algorithmically unsolvable.


As we have already stated (p. 36), metamathematics exists outside and in- 
dependently of our effort to build this or that formal system. All its methods 
are — in principle — available to us for use in the analysis of the behaviour of a 
formal system. 


Pause. But how much of real mathematics are we allowed to use, reliably, to 
study or speak about the “simulator” that the formal system is? For example, 
have we not overstepped our license by using induction (and, implicitly, the 
entire infinite set N) in our Platonist metatheory, specifically in the recursive or 
inductive definitions of terms, well-formed formulas, theorems, etc.? 


The quibble here is largely “political”. Some people argue (a major proponent
of this was Hilbert) as follows: Formal mathematics was meant to crank
out “true” statements of mathematics, but no “false” ones, and this freedom
from contradiction ought to be verifiable. Now, as we are so verifying in the
metatheory (i.e., outside the formal system), shouldn’t the metatheory itself be
“above suspicion” (of contradiction, that is)? Naturally.

† Well, almost. In the interest of brevity, all the variable names used in the displayed formula (1)
are metasymbols.

‡ That is, whether or not $\mathbf{T}$ = Wff.

§ That is, “if $\Gamma$ is consistent” (where we are naming the theory by its nonlogical axioms) “does
it stay so after we have added some formula $\mathcal{A}$ as a nonlogical axiom?”

¶ The methods or scope of the metamathematics that a logician uses (in the investigation of some
formal system) are often restricted for technical or philosophical reasons.


Hilbert’s suggestion for achieving this “above suspicion” status was, essentially,
to utilize in the metatheory only a small fragment of “reality” that is
so simple and close to intuition that it does not need itself a “certificate” (via
formalization) for its freedom from contradiction. In other words, restrict the
metamathematics.† Such a fragment of the metatheory, he said, should have
nothing to do with the infinite, in particular with the entire set $\mathbb{N}$ and all that it
entails (e.g., inductive definitions and proofs).‡

If it were not for Gödel’s incompleteness results, this position (that metamathematical
techniques must be finitary) might have prevailed. However,
Gödel proved it to be futile, and most mathematicians have learnt to feel comfortable
with infinitary metamathematical techniques, or at least with $\mathbb{N}$ and
induction.§ Of course, it would be reckless to use as metamathematical tools
“mathematics” of suspect consistency (e.g., the full naive theory of sets).


It is worth pointing out that one could fit (with some effort) our inductive 
definitions within Hilbert’s style. But we will not do so. First, one would have 
to abandon the elegant (and now widely used) approach with closures, and use 
instead the concept of derivations of Section I.2. Then one would somehow 
have to effect and study derivations without the benefit of the entire set N. 
Bourbaki (1966b, p. 15) does so with his constructions formatives. Hermes 
(1973) is another author who does so, with his “term-” and “formula-calculi” 
(such calculi being, essentially, finite descriptions of derivations). 

Bourbaki (but not Hermes) avoids induction over all of $\mathbb{N}$. In his metamathematical
discussions of terms and formulas¶ that are derived by a derivation
$d_1, \ldots, d_n$, he restricts his induction arguments on the segment $\{0, 1, \ldots, n\}$,
that is, he takes an I.H. on $k < n$ and proceeds to $k + 1$.

† Otherwise we would need to formalize the metamathematics (in order to “certify” it) and
next the metametamathematics, and so on. For if “metaM” is to authoritatively check “M” for
consistency, then it too must be consistent; so let us formalize “metaM” and let “metametaM”
check it; ... a never ending story.

‡ See Hilbert and Bernays (1968, pp. 21-29) for an elaborate scheme that constructs “concrete
number objects” (Ziffern or “numerals”) “|”, “||”, “|||”, etc., that stand for “1”, “2”, “3”, etc.,
complete with a “concrete mathematical induction” proof technique on these objects, and even
the beginnings of their recursion theory. Of course, at any point, only finite sets of such objects
were considered.

§ Some proponents of infinitary techniques in metamathematics have used very strong words
in describing the failure of “Hilbert’s program”. Rasiowa and Sikorski (1963) write in their
introduction: “However Gödel’s results exposed the fiasco of Hilbert’s finitistic methods as far
as consistency is concerned.”

¶ For example, in loc. cit., p. 18, where he proves that, in our notation, $\mathcal{A}[x \leftarrow y]$ and $t[x \leftarrow y]$
are a formula and term respectively.


I.4. Basic Metatheorems


We are dealing with an arbitrary theory $\mathcal{T} = (L, \Lambda, I, \mathbf{T})$, such that $\Lambda$ is the
set of logical axioms (I.3.13) and $I$ are the inference rules (I.3.15). We also let
$\Gamma$ be an appropriate set of nonlogical axioms, i.e., $\mathbf{T} = \mathrm{Thm}_{\Gamma}$.


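I.4.1 Metatheorem (Post’s “Extended” Tautology Theorem). If $\mathcal{A}_1, \ldots, \mathcal{A}_n \models_{\mathrm{Taut}} \mathcal{B}$,
then $\mathcal{A}_1, \ldots, \mathcal{A}_n \vdash \mathcal{B}$.


Proof. The assumption yields that

    $\models_{\mathrm{Taut}} \mathcal{A}_1 \to \cdots \to \mathcal{A}_n \to \mathcal{B}$    (1)

Thus, since the formula in (1) is in $\Lambda$, using Definition I.3.16, we have

    $\mathcal{A}_1, \ldots, \mathcal{A}_n \vdash \mathcal{A}_1 \to \cdots \to \mathcal{A}_n \to \mathcal{B}$    (2)

Applying modus ponens to (2) n times, we deduce $\mathcal{B}$.

I.4.1 is an omnipresent derived rule.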
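Since the hypothesis of I.4.1 is a tautological implication, it can be checked mechanically by truth tables. The following is a minimal sketch, assuming that each prime formula is treated as a propositional variable and that formulas are encoded as nested tuples; the encoding and the name `taut_implies` are illustrative assumptions only.

```python
from itertools import product

# A string is a propositional variable (a prime formula treated as a black
# box); compound formulas are ("not", p), ("or", p, q), ("->", p, q).

def variables(f):
    if isinstance(f, str):
        return {f}
    return set().union(*(variables(g) for g in f[1:]))

def value(f, v):
    if isinstance(f, str):
        return v[f]
    op = f[0]
    if op == "not":
        return not value(f[1], v)
    if op == "or":
        return value(f[1], v) or value(f[2], v)
    if op == "->":
        return (not value(f[1], v)) or value(f[2], v)
    raise ValueError(f"unknown connective: {op}")

def taut_implies(premises, conclusion):
    """True iff every valuation satisfying all premises satisfies the conclusion."""
    names = sorted(set().union(variables(conclusion),
                               *(variables(p) for p in premises)))
    for bits in product([False, True], repeat=len(names)):
        v = dict(zip(names, bits))
        if all(value(p, v) for p in premises) and not value(conclusion, v):
            return False
    return True

# Modus ponens as a tautological implication.
print(taut_implies(["A", ("->", "A", "B")], "B"))                      # True
# Contraposition, a typical "by I.4.1" step.
print(taut_implies([("->", "A", "B")], ("->", ("not", "B"), ("not", "A"))))  # True
# Not every inference is tautological.
print(taut_implies([("->", "A", "B")], "A"))                           # False
```

Whenever such a check succeeds, I.4.1 licenses the corresponding step in a proof.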
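I.4.2 Definition. $\mathcal{A}$ and $\mathcal{B}$ provably equivalent in $\mathcal{T}$ means that $\Gamma \vdash \mathcal{A} \leftrightarrow \mathcal{B}$.


I.4.3 Metatheorem. Any two theorems $\mathcal{A}$ and $\mathcal{B}$ of $\mathcal{T}$ are provably equivalent
in $\mathcal{T}$.


Proof. By I.4.1, $\Gamma \vdash \mathcal{A}$ yields $\Gamma \vdash \mathcal{B} \to \mathcal{A}$. Similarly, $\Gamma \vdash \mathcal{B}$ yields
$\Gamma \vdash \mathcal{A} \to \mathcal{B}$. One more application of I.4.1 yields $\Gamma \vdash \mathcal{A} \leftrightarrow \mathcal{B}$.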


Worth noting: $\vdash \neg x = x \leftrightarrow \neg y = y$ (why?), but neither $\neg x = x$ nor $\neg y = y$
is a $\emptyset$-theorem.


I.4.4 Remark (Hilbert Style Proofs). In practice we write proofs “vertically”,
that is, as numbered vertical sequences (or lists) of formulas. The numbering
helps the annotational comments that we insert to the right of each formula that
we list, as the following proof demonstrates.

A metatheorem admits a metaproof, strictly speaking. The following is a
derived rule (or theorem schema) and thus belongs to the metatheory (and so
does its proof).

Another point of view is possible, however: The syntactic symbols x, $\mathcal{A}$,
and $\mathcal{B}$ below stand for a specific variable and specific formulas that we just
forgot to write down explicitly. Then one can think of the proof as a (formal)
Hilbert style proof.
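I.4.5 Metatheorem ($\forall$-Introduction, Pronounced “A-Introduction”). If x
does not occur free in $\mathcal{A}$, then $\mathcal{A} \to \mathcal{B} \vdash \mathcal{A} \to (\forall x)\mathcal{B}$.

Proof.

(1) $\mathcal{A} \to \mathcal{B}$                      given

(2) $\neg\mathcal{B} \to \neg\mathcal{A}$                (1) and I.4.1

(3) $(\exists x)\neg\mathcal{B} \to \neg\mathcal{A}$     (2) and $\exists$-introduction

(4) $\mathcal{A} \to \neg(\exists x)\neg\mathcal{B}$     (3) and I.4.1

(5) $\mathcal{A} \to (\forall x)\mathcal{B}$             (4), introducing the $\forall$-abbreviation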


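I.4.6 Metatheorem (Specialization). For any formula $\mathcal{A}$ and term t,
$\vdash (\forall x)\mathcal{A} \to \mathcal{A}[t]$.


At this point, the reader may want to review our abbreviation conventions; in
particular, see Ax2 (I.3.13).

Proof.

(1) $\neg\mathcal{A}[t] \to (\exists x)\neg\mathcal{A}$        in $\Lambda$

(2) $\neg(\exists x)\neg\mathcal{A} \to \mathcal{A}[t]$        (1) and I.4.1

(3) $(\forall x)\mathcal{A} \to \mathcal{A}[t]$                (2), introducing the $\forall$-abbreviation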


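I.4.7 Corollary. For any formula $\mathcal{A}$, $\vdash (\forall x)\mathcal{A} \to \mathcal{A}$.


Proof. $\mathcal{A}[x \leftarrow x] \equiv \mathcal{A}$.


Pause. Why is $\mathcal{A}[x \leftarrow x]$ the same string as $\mathcal{A}$?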


I.4.8 Metatheorem (Generalization). For any $\Gamma$ and any $\mathcal{A}$, if $\Gamma \vdash \mathcal{A}$, then
$\Gamma \vdash (\forall x)\mathcal{A}$.

Proof. Choose $y \not\equiv x$. Then we continue any given proof of $\mathcal{A}$ (from $\Gamma$) as
follows:

(1) $\mathcal{A}$                            proved from $\Gamma$
(2) $y = y \to \mathcal{A}$                  (1) and I.4.1
(3) $y = y \to (\forall x)\mathcal{A}$       (2) and $\forall$-introduction
(4) $y = y$                                  in $\Lambda$
(5) $(\forall x)\mathcal{A}$                 (3), (4), and MP


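I.4.9 Corollary. For any $\Gamma$ and any $\mathcal{A}$, $\Gamma \vdash \mathcal{A}$ iff $\Gamma \vdash (\forall x)\mathcal{A}$.


Proof. By I.4.7, I.4.8, and modus ponens.


I.4.10 Corollary. For any $\mathcal{A}$, $\mathcal{A} \vdash (\forall x)\mathcal{A}$ and $(\forall x)\mathcal{A} \vdash \mathcal{A}$.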


The above corollary motivates the following definition. It also justifies the
common mathematical practice of the “implied universal quantifier”. That is,
we often state “...x...” when we mean “$(\forall x)$...x...”.


I.4.11 Definition (Universal Closure). Let $y_1, \ldots, y_n$ be the list of all free variables
of $\mathcal{A}$. The universal closure of $\mathcal{A}$ is the formula $(\forall y_1)(\forall y_2) \cdots (\forall y_n)\mathcal{A}$,
often written more simply as $(\forall y_1 y_2 \ldots y_n)\mathcal{A}$ or even $(\forall \vec{y}_n)\mathcal{A}$.


By I.4.10, a formula deduces and is deduced by its universal closure.


Pause. We said the universal closure. Hopefully, the remark immediately above
is undisturbed by permutation of $(\forall y_1)(\forall y_2) \cdots (\forall y_n)$. Is it? (Exercise I.11).
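Forming a universal closure is a purely syntactic operation, which the following sketch makes concrete. The tuple encoding of formulas, the function names, and the decision to quantify the free variables in alphabetical order are all assumptions made for illustration; the definition above does not single out an order.

```python
def free_vars(f):
    """Free variables of a formula.

    Atoms are ("P", v1, ..., vk) with variable arguments; compound forms are
    ("not", g), ("or", g, h), ("exists", x, g), ("forall", x, g).
    """
    op = f[0]
    if op in ("exists", "forall"):
        return free_vars(f[2]) - {f[1]}
    if op in ("not", "or"):
        return set().union(*(free_vars(g) for g in f[1:]))
    return set(f[1:])

def universal_closure(f):
    """Prefix one universal quantifier per free variable (alphabetical order)."""
    closed = f
    for y in sorted(free_vars(f), reverse=True):
        closed = ("forall", y, closed)
    return closed

# x < y  or  (exists z) z = x   has free variables x and y.
f = ("or", ("<", "x", "y"), ("exists", "z", ("=", "z", "x")))
print(sorted(free_vars(f)))                  # ['x', 'y']
print(universal_closure(f)[:2])              # ('forall', 'x')
print(free_vars(universal_closure(f)))       # set()
```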


I.4.12 Corollary (Substitution of Terms). $\mathcal{A}[x_1, \ldots, x_n] \vdash \mathcal{A}[t_1, \ldots, t_n]$ for
any terms $t_1, \ldots, t_n$.


The reader may wish to review I.3.12 and the remark following it.


Proof. We illustrate the proof for n = 2. What makes it interesting is the requirement
to have “simultaneous substitution”. To that end we first substitute
into $x_1$ and $x_2$ new variables z, w, i.e., not occurring in either $\mathcal{A}$ or in the $t_i$.
The proof is the following sequence. Comments justify, in each case, the presence
of the formula immediately to the left by virtue of the presence of the
immediately preceding formula.

    $\mathcal{A}[x_1, x_2]$                   starting point
    $(\forall x_1)\mathcal{A}[x_1, x_2]$      generalization
    $\mathcal{A}[z, x_2]$                     specialization; $x_1 \leftarrow z$
    $(\forall x_2)\mathcal{A}[z, x_2]$        generalization
    $\mathcal{A}[z, w]$                       specialization; $x_2 \leftarrow w$

Now $z \leftarrow t_1$, $w \leftarrow t_2$, in any order, is the same as “simultaneous substitution
I.3.12”:

    $(\forall z)\mathcal{A}[z, w]$            generalization
    $\mathcal{A}[t_1, w]$                     specialization; $z \leftarrow t_1$
    $(\forall w)\mathcal{A}[t_1, w]$          generalization
    $\mathcal{A}[t_1, t_2]$                   specialization; $w \leftarrow t_2$
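The detour through the new variables z, w is needed because substituting one variable at a time is not, in general, the same as simultaneous substitution. The sketch below illustrates this on a quantifier-free expression; the tuple encoding and the helper names are assumptions of ours, and symbols such as "<" and "S" are presumed never to be used as variable names.

```python
def subst(expr, var, term):
    """Replace every occurrence of the variable `var` in `expr` by `term`.

    Expressions are nested tuples of strings; there are no quantifiers here,
    so variable capture cannot occur.
    """
    if isinstance(expr, str):
        return term if expr == var else expr
    return tuple(subst(e, var, term) for e in expr)

def subst_simultaneous(expr, mapping):
    """Replace all variables listed in `mapping` in a single pass."""
    if isinstance(expr, str):
        return mapping.get(expr, expr)
    return tuple(subst_simultaneous(e, mapping) for e in expr)

formula = ("<", "x1", "x2")            # x1 < x2
t1, t2 = ("S", "x2"), ("S", "x1")      # t1 is S x2, t2 is S x1

simultaneous = subst_simultaneous(formula, {"x1": t1, "x2": t2})
naive = subst(subst(formula, "x1", t1), "x2", t2)

# The route taken in the proof: first rename to fresh z, w, then substitute.
renamed = subst(subst(formula, "x1", "z"), "x2", "w")
via_fresh = subst(subst(renamed, "z", t1), "w", t2)

print(simultaneous)   # ('<', ('S', 'x2'), ('S', 'x1'))
print(naive)          # ('<', ('S', ('S', 'x1')), ('S', 'x1'))
print(via_fresh == simultaneous)   # True
```

The naive one-at-a-time route rewrites the $x_2$ that $t_1$ introduced, whereas the route through fresh variables agrees with the simultaneous result.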


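I.4.13 Metatheorem (The Variant, or Dummy-Renaming, Metatheorem).
For any formula $(\exists x)\mathcal{A}$, if z does not occur in it (i.e., is neither free nor bound),
then $\vdash (\exists x)\mathcal{A} \leftrightarrow (\exists z)\mathcal{A}[x \leftarrow z]$.


We often write this (under the stated conditions) as $\vdash (\exists x)\mathcal{A}[x] \leftrightarrow (\exists z)\mathcal{A}[z]$.
By the way, another way to state the conditions is “if z does not occur in $\mathcal{A}$
(i.e., is neither free nor bound in $\mathcal{A}$), and is different from x”. Of course, if
$z \equiv x$, then there is nothing to prove.


Proof. Since z is substitutable in x under the stated conditions, $\mathcal{A}[x \leftarrow z]$ is
defined. Thus, by Ax2,

    $\vdash \mathcal{A}[x \leftarrow z] \to (\exists x)\mathcal{A}$

By $\exists$-introduction (since z is not free in $(\exists x)\mathcal{A}$) we also have

    $\vdash (\exists z)\mathcal{A}[x \leftarrow z] \to (\exists x)\mathcal{A}$    (1)

We note that x is not free in $(\exists z)\mathcal{A}[x \leftarrow z]$ and is free for z in $\mathcal{A}[x \leftarrow z]$. Indeed,
$\mathcal{A}[x \leftarrow z][z \leftarrow x] \equiv \mathcal{A}$. Thus, by Ax2,

    $\vdash \mathcal{A} \to (\exists z)\mathcal{A}[x \leftarrow z]$

Hence, by $\exists$-introduction,

    $\vdash (\exists x)\mathcal{A} \to (\exists z)\mathcal{A}[x \leftarrow z]$    (2)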


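Tautological implication from (1) and (2) concludes the argument.

Why is $\mathcal{A}[x \leftarrow z][z \leftarrow x] \equiv \mathcal{A}$? We can see this by induction on $\mathcal{A}$ (recall
that z occurs as neither free nor bound in $\mathcal{A}$).

If $\mathcal{A}$ is atomic, then the claim is trivial. The claim also clearly “propagates”
with the propositional formation rules, that is, I.1.8(b).

Consider then the case that $\mathcal{A} \equiv (\exists w)\mathcal{B}$. Note that $w \equiv x$ is possible
under our assumptions, but $w \equiv z$ is not. If $w \equiv x$, then $\mathcal{A}[x \leftarrow z] \equiv \mathcal{A}$; in
particular, z is not free in $\mathcal{A}$; hence $\mathcal{A}[x \leftarrow z][z \leftarrow x] \equiv \mathcal{A}$ as well.

So let us work with $w \not\equiv x$. By I.H., $\mathcal{B}[x \leftarrow z][z \leftarrow x] \equiv \mathcal{B}$. Now

    $\mathcal{A}[x \leftarrow z][z \leftarrow x] \equiv ((\exists w)\mathcal{B})[x \leftarrow z][z \leftarrow x]$
        $\equiv ((\exists w)\mathcal{B}[x \leftarrow z])[z \leftarrow x]$      see I.3.11; $w \not\equiv z$
        $\equiv ((\exists w)\mathcal{B}[x \leftarrow z][z \leftarrow x])$      see I.3.11; $w \not\equiv x$
        $\equiv ((\exists w)\mathcal{B})$                                      I.H.
        $\equiv \mathcal{A}$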


By I.4.13, the issue of substitutability becomes moot. Since we have an infinite
supply of variables (to use, for example, as bound variables), we can always
change the names of all the bound variables in $\mathcal{A}$ so that the new names are
different from all the free variables in $\mathcal{A}$ or t. In so doing we obtain a formula
$\mathcal{B}$ that is (absolutely) provably equivalent to the original.
Then $\mathcal{B}[x \leftarrow t]$ will be defined (t will be substitutable in x). Thus, the
moral is: any term t is free for x in $\mathcal{A}$ after an appropriate “dummy” renaming.
By the way, this is one of the reasons we want an infinite supply (or an
extendible finite set, for the finitist) of formal variables.


I.4.14 Definition. In the following we will often discuss two (or more) theories
at once. Let $\mathcal{T} = (L, \Lambda, I, \mathbf{T})$ and $\mathcal{T}' = (L', \Lambda, I, \mathbf{T}')$ be two theories, such
that $\mathcal{V} \subseteq \mathcal{V}'$. This enables $\mathcal{T}'$ to be “aware” of all the formulas of $\mathcal{T}$ (but not
vice versa, since $L'$ contains additional nonlogical symbols).

We say that $\mathcal{T}'$ is an extension of $\mathcal{T}$ (in symbols, $\mathcal{T} \leq \mathcal{T}'$) iff $\mathbf{T} \subseteq \mathbf{T}'$.

Let $\mathcal{A}$ be a formula over L (so that both theories are aware of it). The symbols
$\vdash_{\mathcal{T}} \mathcal{A}$ and $\vdash_{\mathcal{T}'} \mathcal{A}$ are synonymous with $\mathcal{A} \in \mathbf{T}$ and $\mathcal{A} \in \mathbf{T}'$ respectively.

Note that we did not explicitly mention the nonlogical axioms $\Gamma$ or $\Gamma'$ to the
left of $\vdash$, since the subscript of $\vdash$ takes care of that information.


We say that the extension is conservative iff for any $\mathcal{A}$ over L, whenever
$\vdash_{\mathcal{T}'} \mathcal{A}$ it is also the case that $\vdash_{\mathcal{T}} \mathcal{A}$. That is, when it comes to formulas over
the language (L) that both theories understand, then the new theory does not
do any better than the old in producing theorems.


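I.4.15 Metatheorem (Metatheorem on Constants). Let us extend a language
L of a theory $\mathcal{T}$ by adding new constant symbols $e_1, \ldots, e_n$ to the alphabet $\mathcal{V}$,
resulting in the alphabet $\mathcal{V}'$, language $L'$, and theory $\mathcal{T}'$.

Furthermore, assume that $\Gamma' = \Gamma$, that is, we did not add any new nonlogical
axioms.

Then $\vdash_{\mathcal{T}'} \mathcal{A}[e_1, \ldots, e_n]$ implies $\vdash_{\mathcal{T}} \mathcal{A}[x_1, \ldots, x_n]$ for any variables $x_1, \ldots, x_n$
that occur nowhere in $\mathcal{A}[e_1, \ldots, e_n]$, as either free or bound variables.

Proof. Fix a set of variables $x_1, \ldots, x_n$ as described above. We do induction on
$\mathcal{T}'$-theorems.

Basis. $\mathcal{A}[e_1, \ldots, e_n]$ is a logical axiom (over $L'$); hence so is $\mathcal{A}[x_1, \ldots, x_n]$,
over L (because of the restriction on the $x_i$); thus $\vdash_{\mathcal{T}} \mathcal{A}[x_1, \ldots, x_n]$. Note
that $\mathcal{A}[e_1, \ldots, e_n]$ cannot be nonlogical under our assumptions.

Pause. What does the restriction on the $x_i$ have to do with the claim above?

Modus ponens. Here $\vdash_{\mathcal{T}'} \mathcal{B}[e_1, \ldots, e_n] \to \mathcal{A}[e_1, \ldots, e_n]$ and
$\vdash_{\mathcal{T}'} \mathcal{B}[e_1, \ldots, e_n]$. By I.H., $\vdash_{\mathcal{T}} \mathcal{B}[y_1, \ldots, y_n] \to \mathcal{A}[y_1, \ldots, y_n]$ and
$\vdash_{\mathcal{T}} \mathcal{B}[y_1, \ldots, y_n]$, where $y_1, \ldots, y_n$ occur nowhere in
$\mathcal{B}[e_1, \ldots, e_n] \to \mathcal{A}[e_1, \ldots, e_n]$ as either free or bound variables. By modus ponens,
$\vdash_{\mathcal{T}} \mathcal{A}[y_1, \ldots, y_n]$; hence $\vdash_{\mathcal{T}} \mathcal{A}[x_1, \ldots, x_n]$ by I.4.12 (and I.4.13).

$\exists$-introduction. We have $\vdash_{\mathcal{T}'} \mathcal{B}[e_1, \ldots, e_n] \to \mathcal{C}[e_1, \ldots, e_n]$, z is not free
in $\mathcal{C}[e_1, \ldots, e_n]$, and $\mathcal{A}[e_1, \ldots, e_n] \equiv (\exists z)\mathcal{B}[e_1, \ldots, e_n] \to \mathcal{C}[e_1, \ldots, e_n]$.
By I.H., if $w_1, \ldots, w_n$ (distinct from z) occur nowhere in
$\mathcal{B}[e_1, \ldots, e_n] \to \mathcal{C}[e_1, \ldots, e_n]$ as either free or bound, then we get
$\vdash_{\mathcal{T}} \mathcal{B}[w_1, \ldots, w_n] \to \mathcal{C}[w_1, \ldots, w_n]$. By $\exists$-introduction we get
$\vdash_{\mathcal{T}} (\exists z)\mathcal{B}[w_1, \ldots, w_n] \to \mathcal{C}[w_1, \ldots, w_n]$. By I.4.12 and I.4.13 we get
$\vdash_{\mathcal{T}} (\exists z)\mathcal{B}[x_1, \ldots, x_n] \to \mathcal{C}[x_1, \ldots, x_n]$, i.e., $\vdash_{\mathcal{T}} \mathcal{A}[x_1, \ldots, x_n]$.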


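I.4.16 Corollary. Let us extend a language L of a theory $\mathcal{T}$ by adding new
constant symbols $e_1, \ldots, e_n$ to the alphabet $\mathcal{V}$, resulting in the alphabet $\mathcal{V}'$,
language $L'$, and theory $\mathcal{T}'$.

Furthermore, assume that $\Gamma' = \Gamma$, that is, we did not add any new nonlogical
axioms.

Then $\vdash_{\mathcal{T}'} \mathcal{A}[e_1, \ldots, e_n]$ iff $\vdash_{\mathcal{T}} \mathcal{A}[x_1, \ldots, x_n]$, for any choice of variables
$x_1, \ldots, x_n$.


Proof. If part: Trivially, $\vdash_{\mathcal{T}} \mathcal{A}[x_1, \ldots, x_n]$ implies $\vdash_{\mathcal{T}'} \mathcal{A}[x_1, \ldots, x_n]$, hence
$\vdash_{\mathcal{T}'} \mathcal{A}[e_1, \ldots, e_n]$ by I.4.12.

Only-if part: Choose variables $y_1, \ldots, y_n$ that occur nowhere in
$\mathcal{A}[e_1, \ldots, e_n]$ as either free or bound. By I.4.15, $\vdash_{\mathcal{T}} \mathcal{A}[y_1, \ldots, y_n]$; hence,
by I.4.12 and I.4.13, $\vdash_{\mathcal{T}} \mathcal{A}[x_1, \ldots, x_n]$.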


I.4.17 Remark. Thus, the extension $\mathcal{T}'$ of $\mathcal{T}$ is conservative, for, if $\mathcal{A}$ is over
L, then $\mathcal{A}[e_1, \ldots, e_n] \equiv \mathcal{A}$. Therefore, if $\vdash_{\mathcal{T}'} \mathcal{A}$, then $\vdash_{\mathcal{T}'} \mathcal{A}[e_1, \ldots, e_n]$;
hence $\vdash_{\mathcal{T}} \mathcal{A}[x_1, \ldots, x_n]$, that is, $\vdash_{\mathcal{T}} \mathcal{A}$.

A more emphatic way to put the above is this: $\mathcal{T}'$ is not aware of any new
nonlogical facts that $\mathcal{T}$ did not already “know” although by a different name. If
$\mathcal{T}'$ can prove $\mathcal{A}[e_1, \ldots, e_n]$, then $\mathcal{T}$ can prove the same “statement”, however,
using (any) names (other than the $e_i$) that are meaningful in its own language;
namely, it can prove $\mathcal{A}[x_1, \ldots, x_n]$.


The following corollary stems from the proof (rather than the statement)
of I.4.15 and I.4.16, and is important.


I.4.18 Corollary. Let $e_1, \ldots, e_n$ be constants that do not appear in the nonlogical
axioms $\Gamma$. Then, if $x_1, \ldots, x_n$ are any variables, and if $\Gamma \vdash \mathcal{A}[e_1, \ldots, e_n]$,
it is also the case that $\Gamma \vdash \mathcal{A}[x_1, \ldots, x_n]$.


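I.4.19 Metatheorem (The Deduction Theorem). For any closed formula $\mathcal{A}$,
arbitrary formula $\mathcal{B}$, and set of formulas $\Gamma$, if $\Gamma + \mathcal{A} \vdash \mathcal{B}$, then $\Gamma \vdash \mathcal{A} \to \mathcal{B}$.


N.B. $\Gamma + \mathcal{A}$ denotes the augmentation of $\Gamma$ by adding the formula $\mathcal{A}$. In the
present metatheorem $\mathcal{A}$ is a single (but unspecified) formula. However, the notation
extends to the case where $\mathcal{A}$ is a schema, in which case it means the
augmentation of $\Gamma$ by adding all the instances of the schema.

A converse of the metatheorem is also true trivially: That is, $\Gamma \vdash \mathcal{A} \to \mathcal{B}$
implies $\Gamma + \mathcal{A} \vdash \mathcal{B}$. This direction immediately follows by modus ponens
and does not require the restriction on $\mathcal{A}$.


Proof. The proof is by induction on $\Gamma + \mathcal{A}$ theorems.

Basis. Let $\mathcal{B}$ be logical or nonlogical (but, in the latter case, assume
$\mathcal{B} \not\equiv \mathcal{A}$). Then $\Gamma \vdash \mathcal{B}$.
Since $\mathcal{B} \models_{\mathrm{Taut}} \mathcal{A} \to \mathcal{B}$, it follows by I.4.1 that $\Gamma \vdash \mathcal{A} \to \mathcal{B}$.
Now, if $\mathcal{B} \equiv \mathcal{A}$, then $\mathcal{A} \to \mathcal{B}$ is a logical axiom (group Ax1); hence
$\Gamma \vdash \mathcal{A} \to \mathcal{B}$ once more.

Modus ponens. Let $\Gamma + \mathcal{A} \vdash \mathcal{C}$, and $\Gamma + \mathcal{A} \vdash \mathcal{C} \to \mathcal{B}$.
By I.H., $\Gamma \vdash \mathcal{A} \to \mathcal{C}$ and $\Gamma \vdash \mathcal{A} \to \mathcal{C} \to \mathcal{B}$.
Since $\mathcal{A} \to \mathcal{C},\ \mathcal{A} \to \mathcal{C} \to \mathcal{B} \models_{\mathrm{Taut}} \mathcal{A} \to \mathcal{B}$, we have $\Gamma \vdash \mathcal{A} \to \mathcal{B}$.

$\exists$-introduction. Let $\Gamma + \mathcal{A} \vdash \mathcal{C} \to \mathcal{D}$, and $\mathcal{B} \equiv (\exists x)\mathcal{C} \to \mathcal{D}$, where x is
not free in $\mathcal{D}$. By I.H., $\Gamma \vdash \mathcal{A} \to \mathcal{C} \to \mathcal{D}$. By I.4.1, $\Gamma \vdash \mathcal{C} \to \mathcal{A} \to \mathcal{D}$;
hence $\Gamma \vdash (\exists x)\mathcal{C} \to \mathcal{A} \to \mathcal{D}$ by $\exists$-introduction ($\mathcal{A}$ is closed). One more
application of I.4.1 yields $\Gamma \vdash \mathcal{A} \to (\exists x)\mathcal{C} \to \mathcal{D}$.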


I.4.20 Remark. (1) Is the restriction that $\mathcal{A}$ must be closed important? Yes.
Let $\mathcal{A} \equiv x = a$, where “a” is some constant. Then, even though $\mathcal{A} \vdash (\forall x)\mathcal{A}$
by generalization, it is not always true that $\vdash \mathcal{A} \to (\forall x)\mathcal{A}$. This follows from
soundness considerations (next section). Intuitively, assuming that our logic
“doesn’t lie” (that is, it proves no “invalid” formulas), we immediately infer
that $x = a \to (\forall x)x = a$ cannot be absolutely provable, for it is a “lie”. It fails
at least over $\mathbb{N}$, if a is interpreted to be “0”.
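The failure of $x = a \to (\forall x)x = a$ can be observed already on a small finite domain, without appeal to all of $\mathbb{N}$. The sketch below evaluates the formula for each value of the free variable; the function name and the choice of the three-element domain are ours, and the informal truth evaluation anticipates the semantics of the next section rather than implementing it.

```python
def holds_implication(domain, a, x):
    """Truth value of  x = a -> (forall x) x = a  when the free occurrence of
    x is assigned the value `x` and the constant is interpreted as `a`."""
    antecedent = (x == a)
    consequent = all(y == a for y in domain)
    return (not antecedent) or consequent

domain = [0, 1, 2]      # a finite stand-in; any domain with two elements works
a = 0
results = {x: holds_implication(domain, a, x) for x in domain}
print(results)          # {0: False, 1: True, 2: True}
```

The implication is false when x is assigned the value of a, so it is not true for every value of its free variable, which is what validity would require.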

(2) I.4.16 adds flexibility to applications of the deduction theorem:

    $\vdash_{\mathcal{T}} (\mathcal{A} \to \mathcal{B})[x_1, \ldots, x_n]$    (∗)

where $[x_1, \ldots, x_n]$ is the list of all free variables just in $\mathcal{A}$, is equivalent
(by I.4.16) to

    $\vdash_{\mathcal{T}'} (\mathcal{A} \to \mathcal{B})[e_1, \ldots, e_n]$    (∗∗)

where $e_1, \ldots, e_n$ are new constants added to $\mathcal{V}$ (with no effect on nonlogical
axioms: $\Gamma = \Gamma'$).

Now, since $\mathcal{A}[e_1, \ldots, e_n]$ is closed, proving

    $\Gamma' + \mathcal{A}[e_1, \ldots, e_n] \vdash \mathcal{B}[e_1, \ldots, e_n]$

establishes (∗∗), and hence also (∗).

In practice, one does not perform this step explicitly, but ensures that,
throughout the $\Gamma + \mathcal{A}$ proof, whatever free variables were present in $\mathcal{A}$ “behaved
like constants”, or, as we also say, were frozen.

(3) In some expositions the deduction theorem is not constrained by requiring
that $\mathcal{A}$ be closed (e.g., Bourbaki (1966b) and more recently Enderton (1972)).

Which version is right? Both are in their respective contexts. If all the rules
of inference are “propositional” (e.g., as in Bourbaki (1966b) and Enderton
(1972), who only employ modus ponens), that is, they do not meddle with
quantifiers, then the deduction theorem is unconstrained. If, on the other hand,
the rules of inference manipulate object variables via quantification, then one
cannot avoid constraining the application of the deduction theorem, lest one
want to derive (the invalid) $\vdash \mathcal{A} \to (\forall x)\mathcal{A}$ from the valid $\mathcal{A} \vdash (\forall x)\mathcal{A}$.

This also entails that approaches such as in Bourbaki (1966b) and Enderton
(1972) do not allow “full” generalization “$\mathcal{A} \vdash (\forall x)\mathcal{A}$”. They only allow a
“weaker” rule, “if $\vdash \mathcal{A}$, then $\vdash (\forall x)\mathcal{A}$”.†

(4) This divergence of approach in choosing rules of inference has some additional
repercussions: One has to be careful in defining the semantic counterpart
of $\vdash$, namely, $\models$ (see next section). One wants the two symbols to “track each
other” faithfully (Gödel’s completeness theorem).‡

† Indeed, they allow a bit more generally, namely, the rule “if $\Gamma \vdash \mathcal{A}$ with a side condition, then
$\Gamma \vdash (\forall x)\mathcal{A}$. The side condition is that the formulas of $\Gamma$ do not have free occurrences of x.” Of
course, $\Gamma$ can be always taken to be finite (why?), so that this condition is not unrealistic.


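I.4.21 Corollary (Proof by Contradiction). Let $\mathcal{A}$ be closed. Then $\Gamma \vdash \mathcal{A}$
iff $\Gamma + \neg\mathcal{A}$ is inconsistent.


Proof. If part: Given that $\mathbf{T}$ = Wff, where $\mathbf{T}$ is the theory $\Gamma + \neg\mathcal{A}$. In particular,
$\Gamma + \neg\mathcal{A} \vdash \mathcal{A}$. By the deduction theorem, $\Gamma \vdash \neg\mathcal{A} \to \mathcal{A}$. But
$\neg\mathcal{A} \to \mathcal{A} \models_{\mathrm{Taut}} \mathcal{A}$.

Only-if part: Given that $\Gamma \vdash \mathcal{A}$. Hence $\Gamma + \neg\mathcal{A} \vdash \mathcal{A}$ as well (recall
I.3.18(2)). Of course, $\Gamma + \neg\mathcal{A} \vdash \neg\mathcal{A}$. Since $\mathcal{A}, \neg\mathcal{A} \models_{\mathrm{Taut}} \mathcal{B}$ for an
arbitrary $\mathcal{B}$, we are done.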


Pause. Is it necessary to assume that $\mathcal{A}$ is closed in I.4.21? Why?


The following is important enough to merit stating. It follows from the type
of argument we employed in the only-if part above.


I.4.22 Metatheorem. $\mathcal{T}$ is inconsistent iff for some $\mathcal{A}$, both $\vdash_{\mathcal{T}} \mathcal{A}$ and
$\vdash_{\mathcal{T}} \neg\mathcal{A}$ hold.

We also list below a number of “quotable” proof techniques. These tech- 
niques are routinely used by mathematicians, and will be routinely used by us. 
The proofs of all the following metatheorems are delegated to the reader. 


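I.4.23 Metatheorem (Distributivity or Monotonicity of $\exists$). For any x, $\mathcal{A}$, $\mathcal{B}$,

    $\mathcal{A} \to \mathcal{B} \vdash (\exists x)\mathcal{A} \to (\exists x)\mathcal{B}$


Proof. See Exercise I.12.


I.4.24 Metatheorem (Distributivity or Monotonicity of $\forall$). For any x, $\mathcal{A}$, $\mathcal{B}$,

    $\mathcal{A} \to \mathcal{B} \vdash (\forall x)\mathcal{A} \to (\forall x)\mathcal{B}$


Proof. See Exercise I.13.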


The term “monotonicity” is inspired by thinking of “$\to$” as “$\leq$”. How? Well,
we have the tautology

    $(\mathcal{A} \to \mathcal{B}) \leftrightarrow (\mathcal{A} \lor \mathcal{B} \leftrightarrow \mathcal{B})$    (i)

If we think of “$\mathcal{A} \lor \mathcal{B}$” as “max($\mathcal{A}$, $\mathcal{B}$)”, then the right hand side in (i) above
says that $\mathcal{B}$ is the maximum of $\mathcal{A}$ and $\mathcal{B}$, or that $\mathcal{A}$ is “less than or equal to”
$\mathcal{B}$. The above metatheorems say that both $\exists$ and $\forall$ preserve this “inequality”.

‡ In Mendelson (1987), $\models$ is defined inconsistently with $\vdash$.
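The propositional part of this reading is easy to confirm by brute force. The following sketch checks the tautology (i) on all four truth assignments, and also checks that, with false placed below true, disjunction behaves as max and implication as “less than or equal to”. The helper name `implies` is ours; the sketch says nothing about the quantifier metatheorems themselves.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

pairs = list(product([False, True], repeat=2))

# (A -> B) <-> (A or B <-> B), checked on all four valuations.
tautology = all(implies(a, b) == ((a or b) == b) for a, b in pairs)

# With False < True, "or" is max and "->" is <=.
order_reading = all((a or b) == max(a, b) and implies(a, b) == (a <= b)
                    for a, b in pairs)

print(tautology, order_reading)   # True True
```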


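I.4.25 Metatheorem (Equivalence Theorem, or Leibniz Rule). Let
$\Gamma \vdash \mathcal{A} \leftrightarrow \mathcal{B}$, and let $\mathcal{C}'$ be obtained from $\mathcal{C}$ by replacing some (possibly, but not
necessarily, all) occurrences of a subformula $\mathcal{A}$ of $\mathcal{C}$ by $\mathcal{B}$.

Then $\Gamma \vdash \mathcal{C} \leftrightarrow \mathcal{C}'$, i.e.,

    $\dfrac{\mathcal{A} \leftrightarrow \mathcal{B}}{\mathcal{C} \leftrightarrow \mathcal{C}'}$

is a derived rule.


Proof. The proof is by induction on formulas $\mathcal{C}$. See Exercise I.15.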


Equational or calculational predicate logic is a particular foundation of first
order logic that uses the above Leibniz rule as the primary rule of inference. In
“practising” such logic one prefers to write proofs as chains of equivalences.
Most equivalences in such a chain stem from an application of the rule. See
Dijkstra and Scholten (1990), Gries and Schneider (1994), Tourlakis (2000a,
2000b, 2001b).


I.4.26 Metatheorem (Proof by Cases). Suppose that $\Gamma \vdash \mathcal{A}_1 \lor \cdots \lor \mathcal{A}_n$,
and $\Gamma \vdash \mathcal{A}_i \to \mathcal{B}$ for $i = 1, \ldots, n$. Then $\Gamma \vdash \mathcal{B}$.


Proof. Immediate, by I.4.1.


Proof by cases usually benefits from the application of the deduction theorem.
That is, having established $\Gamma \vdash \mathcal{A}_1 \lor \cdots \lor \mathcal{A}_n$, one then proceeds to adopt,
in turn, each $\mathcal{A}_i$ ($i = 1, \ldots, n$) as a new nonlogical axiom (with its variables
“frozen”). In each “case” ($\mathcal{A}_i$) one proceeds to prove $\mathcal{B}$.

At the end of all this one has established $\Gamma \vdash \mathcal{B}$.


In practice we normally use the following argot:


“We will consider cases $\mathcal{A}_i$, for $i = 1, \ldots, n$.†
Case $\mathcal{A}_1$. ... therefore, $\mathcal{B}$.‡
...
Case $\mathcal{A}_n$. ... therefore, $\mathcal{B}$.”


† To legitimize this splitting into cases, we must, of course, show $\Gamma \vdash \mathcal{A}_1 \lor \cdots \lor \mathcal{A}_n$.
‡ That is, we add the axiom $\mathcal{A}_1$ to $\Gamma$, freezing its variables, and we then prove $\mathcal{B}$.

I.4.27 Metatheorem (Proof by Auxiliary Constant). Suppose that for arbitrary
$\mathcal{A}$ and $\mathcal{B}$ over the language L we know


(1) $\Gamma \vdash (\exists x)\mathcal{A}[x]$

(2) $\Gamma + \mathcal{A}[a] \vdash \mathcal{B}$, where a is a new constant not in the language L of $\Gamma$.
Furthermore assume that in the proof of $\mathcal{B}$ all the free variables of $\mathcal{A}[a]$
were frozen. Then $\Gamma \vdash \mathcal{B}$.


Proof. Exercise I.21.


The technique that flows from this metatheorem is used often in practice. For
example, in projective geometry axiomatized as in Veblen and Young (1916),
in order to prove Desargues’s theorem on perspective triangles on the plane, we
use some arbitrary point (this is the auxiliary constant!) off the plane, having
verified that the axioms guarantee that such a point exists. It is important to
note that Desargues’s theorem does not refer to this point at all; hence the term
“auxiliary”.

In this example, from projective geometry, “$\mathcal{B}$” is Desargues’s theorem,
“$(\exists x)\mathcal{A}[x]$” asserts that there are points outside the plane, a is an arbitrary such
point, and the proof (2) starts with words like “Let a be a point off the plane”,
which is argot for “add the axiom $\mathcal{A}[a]$”.


I.5. Semantics; Soundness, Completeness, Compactness


So what do all these symbols mean? We show in this section how to “decode” the
formal statements (formulas) into informal statements of “real” mathematics.
Conversely, this will entail an understanding of how to code statements of real
mathematics in our formal language.

The rigorous† definition of semantics for first order languages is due to Tarski
and is often referred to as “Tarski semantics”. The flavour of the particular
definition given below is that of Shoenfield (1967), and it accurately reflects
our syntactic choices, most importantly the choice to allow full generalization
$\mathcal{A} \vdash (\forall x)\mathcal{A}$. In particular, we will define the semantic counterpart of $\vdash$, namely,
$\models$, pronounced “logically implies”, to ensure that $\Gamma \vdash \mathcal{A}$ iff $\Gamma \models \mathcal{A}$. This is
the content of Gödel’s completeness theorem, which we prove in this section.


This section will place some additional demands on the reader’s recollection
of notation and facts from informal set theory. We will, among other things,
make use of notation from naive set theory, such as

    $A^n$ (or $\underbrace{A \times \cdots \times A}_{n \text{ times}}$)

for the set of ordered n-tuples of members of A.

We will also use the symbols $\subseteq$, $\cup$, $\bigcup_{a \in I}$.‡

In some passages (delimited by warning signs) these demands will
border on the unreasonable.

For example, in the proof of the Gödel-Mal’cev completeness-compactness
result we will need some elementary understanding of ordinals (used as indexing
tools) and cardinality. Some readers may not have such background. This
prerequisite material can be attained by consulting a set theory book (e.g., the
second volume of these lectures).

† One often says “The formal definition of semantics ...”, but the word “formal” is misleading
here, for we are actually defining semantics in the metatheory, not in the formal theory.


I.5.1 Definition. Given a language $L = (\mathcal{V}, \mathrm{Term}, \mathrm{Wff})$, a structure $\mathfrak{M} = (M, \mathcal{I})$ appropriate for $L$ is such that $M \neq \emptyset$ is a set (the domain or underlying set or universe‡) and $\mathcal{I}$ (“$\mathcal{I}$” for interpretation) is a mapping that assigns

(1) to each constant $a$ of $\mathcal{V}$ a unique member $a^{\mathcal{I}} \in M$,

(2) to each function $f$ of $\mathcal{V}$, of arity $n$, a unique (total)§ function $f^{\mathcal{I}} : M^n \to M$,

(3) to each predicate $P$ of $\mathcal{V}$, of arity $n$, a unique set $P^{\mathcal{I}} \subseteq M^n$.¶


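I.5.2 Remark. The structure $\mathfrak{M}$ is often given more verbosely, in conformity with practice in algebra. Namely, one “unpacks” the $\mathcal{I}$ into a list $a^{\mathcal{I}}, b^{\mathcal{I}}, \ldots;\ f^{\mathcal{I}}, g^{\mathcal{I}}, \ldots;\ P^{\mathcal{I}}, Q^{\mathcal{I}}, \ldots$ and writes instead $\mathfrak{M} = (M; a^{\mathcal{I}}, b^{\mathcal{I}}, \ldots; f^{\mathcal{I}}, g^{\mathcal{I}}, \ldots; P^{\mathcal{I}}, Q^{\mathcal{I}}, \ldots)$. Under this understanding, a structure is an underlying set (universe), $M$, along with a list of “concrete” constants, functions, and relations that “interpret” corresponding “abstract” items of the language.

Under the latter notational circumstances we often use the symbols $a^{\mathfrak{M}}$, $f^{\mathfrak{M}}$, $P^{\mathfrak{M}}$, rather than $a^{\mathcal{I}}$, $f^{\mathcal{I}}$, $P^{\mathcal{I}}$, to indicate the interpretations in $\mathfrak{M}$ of the constant $a$, function $f$, and predicate $P$ respectively.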


† If we have a set of sets $\{S_a, S_b, S_c, \ldots\}$, where the indices $a, b, c, \ldots$ all come out of an “index set” $I$, then the symbol $\bigcup_{i \in I} S_i$ stands for the collection of all those objects $x$ that are found in at least one of the sets $S_i$. It is a common habit to write $\bigcup_{i=0}^{\infty} S_i$ instead of $\bigcup_{i \in \mathbb{N}} S_i$. $A \cup B$ is the same as $\bigcup_{i \in \{1,2\}} S_i$, where we have let $S_1 = A$ and $S_2 = B$.

‡ Often the qualification “of discourse” is added to the terms “domain” and “universe”.

§ Requiring $f^{\mathcal{I}}$ to be total is a traditional convention. By the way, total means that $f^{\mathcal{I}}$ is defined everywhere on $M^n$.

¶ Thus $P^{\mathcal{I}}$ is an $n$-ary relation with inputs and outputs in $M$.


We have said above “structure appropriate for $L$”, thus emphasizing the generality of the language and therefore our ability to interpret what we say in it in many different ways. Often though, e.g., as in formal arithmetic or set theory, we have a structure in mind to begin with, and then build a formal language to formally codify statements about the objects in the structure. Under these circumstances, in effect, we define a language appropriate for the structure. We use the symbol $L_{\mathfrak{M}}$ to indicate that the language was built to fit the structure $\mathfrak{M}$.


I.5.3 Definition. We routinely add new nonlogical symbols to a language $L$ to obtain a language $L'$. We say that $L'$ is an extension of $L$ and that $L$ is a restriction of $L'$. Suppose that $\mathfrak{M} = (M, \mathcal{I})$ is a structure for $L$, and let $\mathfrak{M}' = (M, \mathcal{I}')$ be a structure with the same underlying set $M$, but with $\mathcal{I}$ extended to $\mathcal{I}'$ so that the latter gives meaning to all new symbols while it gives the same meaning, as $\mathcal{I}$ does, to the symbols of $L$.

We call $\mathfrak{M}'$ an expansion (rather than “extension”) of $\mathfrak{M}$, and $\mathfrak{M}$ a reduct (rather than “restriction”) of $\mathfrak{M}'$. We may (often) write $\mathcal{I} = \mathcal{I}' \upharpoonright L$ to indicate that the “mapping” $\mathcal{I}'$, restricted to $L$ (symbol “$\upharpoonright$”), equals $\mathcal{I}$. We may also write $\mathfrak{M} = \mathfrak{M}' \upharpoonright L$ instead.


I.5.4 Definition. Given $L$ and a structure $\mathfrak{M} = (M, \mathcal{I})$ appropriate for $L$. $L(\mathfrak{M})$ denotes the language obtained from $L$ by adding to $\mathcal{V}$ a unique new name $\overline{i}$ for each object $i \in M$.

This amends both sets Term, Wff into Term($\mathfrak{M}$), Wff($\mathfrak{M}$). Members of the latter sets are called $\mathfrak{M}$-terms and $\mathfrak{M}$-formulas respectively.

We extend the mapping $\mathcal{I}$ to the new constants by: $\overline{i}^{\mathcal{I}} = i$ for all $i \in M$ (where the “=” here is metamathematical: equality on $M$).

All we have done here is to allow ourselves to do substitutions like $[x \leftarrow i]$ formally. We do, instead, $[x \leftarrow \overline{i}]$. One next gives “meaning” to all closed terms in $L(\mathfrak{M})$. The following uses definition by recursion (I.2.13) and relies on the fact that the rules that define terms are unambiguous.


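I.5.5 Definition. For closed terms $t$ in Term($\mathfrak{M}$) we define the symbol $t^{\mathcal{I}} \in M$ inductively:

(1) If $t$ is any of $a$ (original constant) or $\overline{i}$ (imported constant), then $t^{\mathcal{I}}$ has already been defined.

(2) If $t$ is the string $f t_1 \ldots t_n$, where $f$ is $n$-ary, and $t_1, \ldots, t_n$ are closed $\mathfrak{M}$-terms, we define $t^{\mathcal{I}}$ to be the object (of $M$) $f^{\mathcal{I}}(t_1^{\mathcal{I}}, \ldots, t_n^{\mathcal{I}})$.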


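Finally, we give meaning to all closed $\mathfrak{M}$-formulas, again by recursion (over Wff).

I.5.6 Definition. For any closed formula $\mathcal{A}$ in Wff($\mathfrak{M}$) we define the symbol $\mathcal{A}^{\mathcal{I}}$ inductively. In all cases, $\mathcal{A}^{\mathcal{I}} \in \{\mathbf{t}, \mathbf{f}\}$.

(1) If $\mathcal{A} \equiv t = s$, where $t$ and $s$ are closed $\mathfrak{M}$-terms, then $\mathcal{A}^{\mathcal{I}} = \mathbf{t}$ iff $t^{\mathcal{I}} = s^{\mathcal{I}}$. (The last two occurrences of “=” are metamathematical.)

(2) If $\mathcal{A} \equiv P t_1 \ldots t_n$, where $P$ is an $n$-ary predicate and the $t_i$ are closed $\mathfrak{M}$-terms, then $\mathcal{A}^{\mathcal{I}} = \mathbf{t}$ iff $\langle t_1^{\mathcal{I}}, \ldots, t_n^{\mathcal{I}} \rangle \in P^{\mathcal{I}}$ or $P^{\mathcal{I}}(t_1^{\mathcal{I}}, \ldots, t_n^{\mathcal{I}})$ “holds”. (Or “is true”; see p. 19. Of course, the last occurrence of “=” is metamathematical.)

(3) If $\mathcal{A}$ is any of the sentences $\neg\mathcal{B}$, $\mathcal{B} \lor \mathcal{C}$, then $\mathcal{A}^{\mathcal{I}}$ is determined by the usual truth tables (see p. 30) using the values $\mathcal{B}^{\mathcal{I}}$ and $\mathcal{C}^{\mathcal{I}}$. That is, $(\neg\mathcal{B})^{\mathcal{I}} = F_{\neg}(\mathcal{B}^{\mathcal{I}})$ and $(\mathcal{B} \lor \mathcal{C})^{\mathcal{I}} = F_{\lor}(\mathcal{B}^{\mathcal{I}}, \mathcal{C}^{\mathcal{I}})$. (The last two occurrences of “=” are metamathematical.)

(4) If $\mathcal{A} \equiv (\exists x)\mathcal{B}$, then $\mathcal{A}^{\mathcal{I}} = \mathbf{t}$ iff $(\mathcal{B}[x \leftarrow \overline{i}])^{\mathcal{I}} = \mathbf{t}$ for some $i \in M$. (The last two occurrences of “=” are metamathematical.)

We have “imported” constants from $M$ into $L$ in order to be able to state the semantics of $(\exists x)\mathcal{B}$ above in the simple manner we just did (following Shoenfield (1967)).

We often state the semantics of $(\exists x)\mathcal{B}$ by writing

$((\exists x)\mathcal{B}[x])^{\mathcal{I}}$ is true iff $(\exists i \in M)(\mathcal{B}[\overline{i}])^{\mathcal{I}}$ is true ©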
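The recursions of I.5.5 and I.5.6 can be tried out on a small finite structure. The following is only a sketch under stated assumptions: the tuple encoding of terms and formulas, the names `subst`, `term_value`, `truth_value`, and the three-element example structure are all hypothetical choices made for illustration, not notation from the text. Imported constants are encoded as `("imp", i)`, and the existential clause substitutes them exactly as in I.5.6(4).

```python
def subst_term(t, x, i):
    """t[x <- i-bar]: replace the variable x by the imported constant naming i."""
    if t[0] == "var":
        return ("imp", i) if t[1] == x else t
    if t[0] == "fn":
        return ("fn", t[1], [subst_term(r, x, i) for r in t[2]])
    return t  # original or imported constant: nothing to replace

def subst(A, x, i):
    """A[x <- i-bar] for formulas; only free occurrences of x are replaced."""
    kind = A[0]
    if kind == "eq":
        return ("eq", subst_term(A[1], x, i), subst_term(A[2], x, i))
    if kind == "pred":
        return ("pred", A[1], [subst_term(r, x, i) for r in A[2]])
    if kind == "not":
        return ("not", subst(A[1], x, i))
    if kind == "or":
        return ("or", subst(A[1], x, i), subst(A[2], x, i))
    if kind == "ex":
        # if the quantifier binds x, there are no free occurrences below it
        return A if A[1] == x else ("ex", A[1], subst(A[2], x, i))
    raise ValueError(kind)

def term_value(t, interp):
    """Value of a closed term, following I.5.5."""
    if t[0] == "imp":
        return t[1]                       # (i-bar)^I = i
    if t[0] == "const":
        return interp["const"][t[1]]
    if t[0] == "fn":
        return interp["fn"][t[1]](*[term_value(r, interp) for r in t[2]])
    raise ValueError("not a closed term")

def truth_value(A, M, interp):
    """Truth value of a closed formula, following I.5.6(1)-(4)."""
    kind = A[0]
    if kind == "eq":
        return term_value(A[1], interp) == term_value(A[2], interp)
    if kind == "pred":
        return tuple(term_value(r, interp) for r in A[2]) in interp["pred"][A[1]]
    if kind == "not":
        return not truth_value(A[1], M, interp)
    if kind == "or":
        return truth_value(A[1], M, interp) or truth_value(A[2], M, interp)
    if kind == "ex":
        return any(truth_value(subst(A[2], A[1], i), M, interp) for i in M)
    raise ValueError(kind)

# Hypothetical structure: M = {0, 1, 2}, successor modulo 3, strict order.
M = [0, 1, 2]
interp = {
    "const": {"zero": 0},
    "fn": {"s": lambda n: (n + 1) % 3},
    "pred": {"lt": {(0, 1), (0, 2), (1, 2)}},
}
x, zero = ("var", "x"), ("const", "zero")
has_predecessor = ("ex", "x", ("eq", ("fn", "s", [x]), zero))  # (∃x) s x = zero
below_zero = ("ex", "x", ("pred", "lt", [x, zero]))            # (∃x) x < zero
```

In this structure `truth_value(has_predecessor, M, interp)` holds (take the object 2), while `truth_value(below_zero, M, interp)` does not. The sketch only works because `M` is finite; for an infinite domain the existential clause is not a terminating search.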


I.5.7 Definition. Let $\mathcal{A} \in$ Wff, and $\mathfrak{M}$ be a structure as above.

An $\mathfrak{M}$-instance of $\mathcal{A}$ is an $\mathfrak{M}$-sentence $\mathcal{A}(\overline{i_1}, \ldots, \overline{i_k})$ (that is, all the free variables of $\mathcal{A}$ have been replaced by imported constants).

We say that $\mathcal{A}$ is valid in $\mathfrak{M}$, or that $\mathfrak{M}$ is a model of $\mathcal{A}$, iff for all $\mathfrak{M}$-instances $\mathcal{A}'$ of $\mathcal{A}$ it is the case that $\mathcal{A}'^{\mathcal{I}} = \mathbf{t}$.† Under these circumstances we write $\models_{\mathfrak{M}} \mathcal{A}$.

For any set of formulas $\Gamma$ from Wff, the expression $\models_{\mathfrak{M}} \Gamma$, pronounced “$\mathfrak{M}$ is a model of $\Gamma$”, means that for all $\mathcal{A} \in \Gamma$, $\models_{\mathfrak{M}} \mathcal{A}$.

A formula $\mathcal{A}$ is universally valid or logically valid (we often say just valid) iff every structure appropriate for the language is a model of $\mathcal{A}$.

Under these circumstances we simply write $\models \mathcal{A}$.

If $\Gamma$ is a set of formulas, then we say it is satisfiable iff it has a model. It is finitely satisfiable iff every finite subset of $\Gamma$ has a model.‡

© The definition of validity of $\mathcal{A}$ in a structure $\mathfrak{M}$ corresponds with the normal mathematical practice. It says that a formula is true (in a given “context” $\mathfrak{M}$) just in case it is so for all possible values of the free variables. ©

† We henceforth discontinue our pedantic “(The last occurrence of “=” is metamathematical.)”.

‡ These two concepts are often defined just for sentences.

I.5.8 Definition. We say that $\Gamma$ logically implies $\mathcal{A}$, in symbols $\Gamma \models \mathcal{A}$, meaning that every model of $\Gamma$ is also a model of $\mathcal{A}$.

I.5.9 Definition (Soundness). A theory (identified by its nonlogical axioms) $\Gamma$ is sound iff, for all $\mathcal{A} \in$ Wff, $\Gamma \vdash \mathcal{A}$ implies $\Gamma \models \mathcal{A}$, that is, iff all the theorems of the theory are logically implied by the nonlogical axioms.

Clearly then, a pure theory $\mathcal{T}$ is sound iff $\vdash_{\mathcal{T}} \mathcal{A}$ implies $\models \mathcal{A}$ for all $\mathcal{A} \in$ Wff. That is, all its theorems are universally valid. ©

Towards the soundness result† below we look at two tedious (but easy) lemmata.


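I.5.10 Lemma. Given a term $t$, variables $x \not\equiv y$, where $y$ does not occur in $t$, and a constant $a$. Then, for any term $s$ and formula $\mathcal{A}$, $s[x \leftarrow t][y \leftarrow a] \equiv s[y \leftarrow a][x \leftarrow t]$ and $\mathcal{A}[x \leftarrow t][y \leftarrow a] \equiv \mathcal{A}[y \leftarrow a][x \leftarrow t]$.

Proof. Induction on $s$:

Basis:

$$s[x \leftarrow t][y \leftarrow a] \equiv \begin{cases} t & \text{if } s \equiv x \\ a & \text{if } s \equiv y \\ z & \text{if } s \equiv z, \text{ where } x \not\equiv z \not\equiv y \\ b & \text{if } s \equiv b \end{cases} \ \equiv s[y \leftarrow a][x \leftarrow t]$$

For the induction step let $s \equiv f r_1 \ldots r_n$, where $f$ has arity $n$. Then

$$\begin{aligned} s[x \leftarrow t][y \leftarrow a] &\equiv f\, r_1[x \leftarrow t][y \leftarrow a] \ldots r_n[x \leftarrow t][y \leftarrow a] \\ &\equiv f\, r_1[y \leftarrow a][x \leftarrow t] \ldots r_n[y \leftarrow a][x \leftarrow t] \quad \text{by I.H.} \\ &\equiv s[y \leftarrow a][x \leftarrow t] \end{aligned}$$

Induction on $\mathcal{A}$:

Basis:

If $\mathcal{A} \equiv P r_1 \ldots r_n$, then

$$\begin{aligned} \mathcal{A}[x \leftarrow t][y \leftarrow a] &\equiv P\, r_1[x \leftarrow t][y \leftarrow a] \ldots r_n[x \leftarrow t][y \leftarrow a] \\ &\equiv P\, r_1[y \leftarrow a][x \leftarrow t] \ldots r_n[y \leftarrow a][x \leftarrow t] \\ &\equiv \mathcal{A}[y \leftarrow a][x \leftarrow t] \end{aligned}$$

If $\mathcal{A} \equiv r = s$, then

$$\begin{aligned} \mathcal{A}[x \leftarrow t][y \leftarrow a] &\equiv r[x \leftarrow t][y \leftarrow a] = s[x \leftarrow t][y \leftarrow a] \\ &\equiv r[y \leftarrow a][x \leftarrow t] = s[y \leftarrow a][x \leftarrow t] \\ &\equiv \mathcal{A}[y \leftarrow a][x \leftarrow t] \end{aligned}$$

The property we are proving, trivially, propagates with Boolean connectives. Let us do the induction step just in the case where $\mathcal{A} \equiv (\exists w)\mathcal{B}$. If $w \equiv x$ or $w \equiv y$, then the result is trivial. Otherwise,

$$\begin{aligned} \mathcal{A}[x \leftarrow t][y \leftarrow a] &\equiv ((\exists w)\mathcal{B})[x \leftarrow t][y \leftarrow a] \\ &\equiv ((\exists w)\mathcal{B}[x \leftarrow t][y \leftarrow a]) \\ &\equiv ((\exists w)\mathcal{B}[y \leftarrow a][x \leftarrow t]) \quad \text{by I.H.} \\ &\equiv ((\exists w)\mathcal{B})[y \leftarrow a][x \leftarrow t] \\ &\equiv \mathcal{A}[y \leftarrow a][x \leftarrow t] \end{aligned}$$

† Also nicknamed “the easy half of Gödel’s completeness theorem”.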


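I.5.11 Lemma. Given a structure $\mathfrak{M} = (M, \mathcal{I})$, a term $s$, and a formula $\mathcal{A}$, both over $L(\mathfrak{M})$. Suppose each of $s$ and $\mathcal{A}$ have at most one free variable, $x$.

Let $t$ be a closed term over $L(\mathfrak{M})$ such that $t^{\mathcal{I}} = i \in M$. Then $(s[x \leftarrow t])^{\mathcal{I}} = (s[x \leftarrow \overline{i}])^{\mathcal{I}}$ and $(\mathcal{A}[x \leftarrow t])^{\mathcal{I}} = (\mathcal{A}[x \leftarrow \overline{i}])^{\mathcal{I}}$. Of course, since $t$ is closed, $\mathcal{A}[x \leftarrow t]$ is defined.

Proof. Induction on $s$:

Basis. $s[x \leftarrow t] \equiv s$ if $s \in \{y, a, \overline{j}\}$ ($y \not\equiv x$). Hence $(s[x \leftarrow t])^{\mathcal{I}} = s^{\mathcal{I}} = (s[x \leftarrow \overline{i}])^{\mathcal{I}}$ in this case. If $s \equiv x$, then $s[x \leftarrow t] \equiv t$ and $s[x \leftarrow \overline{i}] \equiv \overline{i}$, and the claim follows once more.

For the induction step let $s \equiv f r_1 \ldots r_n$, where $f$ has arity $n$. Then

$$\begin{aligned} (s[x \leftarrow t])^{\mathcal{I}} &= f^{\mathcal{I}}\big((r_1[x \leftarrow t])^{\mathcal{I}}, \ldots, (r_n[x \leftarrow t])^{\mathcal{I}}\big) \\ &= f^{\mathcal{I}}\big((r_1[x \leftarrow \overline{i}])^{\mathcal{I}}, \ldots, (r_n[x \leftarrow \overline{i}])^{\mathcal{I}}\big) \quad \text{by I.H.} \\ &= (s[x \leftarrow \overline{i}])^{\mathcal{I}} \end{aligned}$$

Induction on $\mathcal{A}$:

Basis. If $\mathcal{A} \equiv P r_1 \ldots r_n$, then†

$$\begin{aligned} (\mathcal{A}[x \leftarrow t])^{\mathcal{I}} &= P^{\mathcal{I}}\big((r_1[x \leftarrow t])^{\mathcal{I}}, \ldots, (r_n[x \leftarrow t])^{\mathcal{I}}\big) \\ &= P^{\mathcal{I}}\big((r_1[x \leftarrow \overline{i}])^{\mathcal{I}}, \ldots, (r_n[x \leftarrow \overline{i}])^{\mathcal{I}}\big) \\ &= (\mathcal{A}[x \leftarrow \overline{i}])^{\mathcal{I}} \end{aligned}$$

Similarly if $\mathcal{A} \equiv r = s$.

The property we are proving, clearly, propagates with Boolean connectives. Let us do the induction step just in the case where $\mathcal{A} \equiv (\exists w)\mathcal{B}$. If $w \equiv x$, the result is trivial. Otherwise, we note that, since $t$ is closed, $w$ does not occur in $t$, and proceed as follows:

$$\begin{aligned} (\mathcal{A}[x \leftarrow t])^{\mathcal{I}} = \mathbf{t} \ &\text{iff } (((\exists w)\mathcal{B})[x \leftarrow t])^{\mathcal{I}} = \mathbf{t} \\ &\text{iff } ((\exists w)\mathcal{B}[x \leftarrow t])^{\mathcal{I}} = \mathbf{t} \\ &\text{iff } (\mathcal{B}[x \leftarrow t][w \leftarrow \overline{j}])^{\mathcal{I}} = \mathbf{t} \text{ for some } j \in M, \text{ by I.5.6(4)} \\ &\text{iff } (\mathcal{B}[w \leftarrow \overline{j}][x \leftarrow t])^{\mathcal{I}} = \mathbf{t} \text{ for some } j \in M, \text{ by I.5.10} \\ &\text{iff } ((\mathcal{B}[w \leftarrow \overline{j}])[x \leftarrow t])^{\mathcal{I}} = \mathbf{t} \text{ for some } j \in M \\ &\text{iff } ((\mathcal{B}[w \leftarrow \overline{j}])[x \leftarrow \overline{i}])^{\mathcal{I}} = \mathbf{t} \text{ for some } j \in M, \text{ by I.H.} \\ &\text{iff } (\mathcal{B}[w \leftarrow \overline{j}][x \leftarrow \overline{i}])^{\mathcal{I}} = \mathbf{t} \text{ for some } j \in M \\ &\text{iff } (\mathcal{B}[x \leftarrow \overline{i}][w \leftarrow \overline{j}])^{\mathcal{I}} = \mathbf{t} \text{ for some } j \in M, \text{ by I.5.10} \\ &\text{iff } ((\exists w)\mathcal{B}[x \leftarrow \overline{i}])^{\mathcal{I}} = \mathbf{t}, \text{ by I.5.6(4)} \\ &\text{iff } (((\exists w)\mathcal{B})[x \leftarrow \overline{i}])^{\mathcal{I}} = \mathbf{t} \\ &\text{iff } (\mathcal{A}[x \leftarrow \overline{i}])^{\mathcal{I}} = \mathbf{t} \end{aligned}$$

† For a metamathematical relation $Q$, as is usual (p. 19), $Q(a, b, \ldots) = \mathbf{t}$, or just $Q(a, b, \ldots)$, stands for $\langle a, b, \ldots \rangle \in Q$.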


I.5.12 Metatheorem (Soundness). Any first order theory (identified by its nonlogical axioms) $\Gamma$, over some language $L$, is sound.

Proof. By induction on $\Gamma$-theorems, $\mathcal{A}$, we prove that $\Gamma \models \mathcal{A}$. That is, we fix a structure for $L$, say $\mathfrak{M}$, and assume that $\models_{\mathfrak{M}} \Gamma$. We then proceed to show that $\models_{\mathfrak{M}} \mathcal{A}$.

Basis. $\mathcal{A}$ is a nonlogical axiom. Then our conclusion is part of the assumption, by I.5.7.

Or $\mathcal{A}$ is a logical axiom. There are a number of cases:

Case 1. $\models_{taut} \mathcal{A}$. We fix an $\mathfrak{M}$-instance of $\mathcal{A}$, say $\mathcal{A}'$, and show that $\mathcal{A}'^{\mathcal{I}} = \mathbf{t}$. Let $p_1, \ldots, p_n$ be all the propositional variables (alias prime formulas) occurring in $\mathcal{A}'$. Define a valuation $v$ by setting $v(p_i) = p_i^{\mathcal{I}}$ for $i = 1, \ldots, n$. Clearly, $\mathbf{t} = \overline{v}(\mathcal{A}') = \mathcal{A}'^{\mathcal{I}}$ (the first “=” because $\models_{taut} \mathcal{A}'$, the second because after prime formulas were taken care of, all that remains to be done for the evaluation of $\mathcal{A}'^{\mathcal{I}}$ is to apply Boolean connectives; see I.5.6(3)).

Pause. Why is $\models_{taut} \mathcal{A}'$?


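Case 2. $\mathcal{A} \equiv \mathcal{B}[t] \to (\exists x)\mathcal{B}$. Again, we look at an $\mathfrak{M}$-instance $\mathcal{B}'[t'] \to (\exists x)\mathcal{B}'$. We want $(\mathcal{B}'[t'] \to (\exists x)\mathcal{B}')^{\mathcal{I}} = \mathbf{t}$, but suppose instead that

$(\mathcal{B}'[t'])^{\mathcal{I}} = \mathbf{t}$ (1)

and

$((\exists x)\mathcal{B}')^{\mathcal{I}} = \mathbf{f}$ (2)

Let $t'^{\mathcal{I}} = i$ ($i \in M$). By I.5.11 and (1), $(\mathcal{B}'[\overline{i}])^{\mathcal{I}} = \mathbf{t}$. By I.5.6(4), $((\exists x)\mathcal{B}')^{\mathcal{I}} = \mathbf{t}$, contradicting (2).

Case 3. $\mathcal{A} \equiv x = x$. Then an arbitrary $\mathfrak{M}$-instance is $\overline{i} = \overline{i}$ for some $i \in M$. By I.5.6(1), $(\overline{i} = \overline{i})^{\mathcal{I}} = \mathbf{t}$.

Case 4. $\mathcal{A} \equiv t = s \to (\mathcal{B}[t] \leftrightarrow \mathcal{B}[s])$. Once more, we take an arbitrary $\mathfrak{M}$-instance, $t' = s' \to (\mathcal{B}'[t'] \leftrightarrow \mathcal{B}'[s'])$. Suppose that $(t' = s')^{\mathcal{I}} = \mathbf{t}$. That is, $t'^{\mathcal{I}} = s'^{\mathcal{I}} =$ (let us say) $i$ (in $M$). But then

$$\begin{aligned} (\mathcal{B}'[t'])^{\mathcal{I}} &= (\mathcal{B}'[\overline{i}])^{\mathcal{I}} \quad \text{by I.5.11} \\ &= (\mathcal{B}'[s'])^{\mathcal{I}} \quad \text{by I.5.11} \end{aligned}$$

Hence $(\mathcal{B}'[t'] \leftrightarrow \mathcal{B}'[s'])^{\mathcal{I}} = \mathbf{t}$.

For the induction step we have two cases:

Modus ponens. Let $\mathcal{B}$ and $\mathcal{B} \to \mathcal{A}$ be $\Gamma$-theorems. Fix an $\mathfrak{M}$-instance $\mathcal{B}' \to \mathcal{A}'$. Since $\mathcal{B}', \mathcal{B}' \to \mathcal{A}' \models_{taut} \mathcal{A}'$, the argument here is entirely analogous to the case $\mathcal{A} \in \Lambda$ (hence we omit it).

∃-introduction. Let $\mathcal{A} \equiv (\exists x)\mathcal{B} \to \mathcal{C}$ and $\Gamma \vdash \mathcal{B} \to \mathcal{C}$, where $x$ is not free in $\mathcal{C}$. By the I.H.

$\models_{\mathfrak{M}} \mathcal{B} \to \mathcal{C}$ (3)

Let $(\exists x)\mathcal{B}' \to \mathcal{C}'$ be an $\mathfrak{M}$-instance such that (despite expectations) $((\exists x)\mathcal{B}')^{\mathcal{I}} = \mathbf{t}$ but

$\mathcal{C}'^{\mathcal{I}} = \mathbf{f}$ (4)

Thus

$(\mathcal{B}'[\overline{i}])^{\mathcal{I}} = \mathbf{t}$ (5)

for some $i \in M$. Since $x$ is not free in $\mathcal{C}$, $\mathcal{B}'[\overline{i}] \to \mathcal{C}'$ is a false (by (4) and (5)) $\mathfrak{M}$-instance of $\mathcal{B} \to \mathcal{C}$, contradicting (3).

© We used the condition of ∃-introduction above, by saying “Since $x$ is not free in $\mathcal{C}$, $\mathcal{B}'[\overline{i}] \to \mathcal{C}'$ is a(n) ... $\mathfrak{M}$-instance of $\mathcal{B} \to \mathcal{C}$”. So the condition was useful. But is it essential? Yes, since, for example, if $x \not\equiv y$, then $x = y \to x = y \not\models (\exists x)x = y \to x = y$. ©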
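The failure of logical implication in the last remark can be checked by brute force in any structure with at least two elements. The sketch below uses a hypothetical two-element domain and hand-translates the two formulas into Python predicates; the names `premise` and `conclusion` are illustrative only. An instance is given by the objects `i` and `j` that name the free `x` and `y`; in the conclusion, the `x` under the quantifier is bound, so only the `x` on the right of the arrow receives `i`.

```python
M = [0, 1]  # hypothetical two-element domain; equality is interpreted as identity

def premise(i, j):
    """Instance of  x = y -> x = y  with x, y named by i, j."""
    return (not (i == j)) or (i == j)

def conclusion(i, j):
    """Instance of  (∃x) x = y -> x = y ; the free x is named by i, y by j."""
    antecedent = any(k == j for k in M)   # (∃x) x = j-bar
    return (not antecedent) or (i == j)

premise_valid = all(premise(i, j) for i in M for j in M)
conclusion_valid = all(conclusion(i, j) for i in M for j in M)
```

Here `premise_valid` holds while `conclusion_valid` fails: the instance with `i = 0`, `j = 1` has a true antecedent and a false consequent. So this structure is a model of the premise but not of the conclusion.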


As a corollary of soundness we have the consistency of pure theories:

I.5.13 Corollary. Any first order pure theory is consistent.

Proof. Let $\mathcal{T}$ be a pure theory over some language $L$. Since $\not\models \neg x = x$, it follows that $\nvdash_{\mathcal{T}} \neg x = x$, thus $\mathcal{T} \neq$ Wff.

I.5.14 Corollary. Any first order theory that has a model is consistent.

Proof. Let $\mathcal{T}$ be a first order theory over some language $L$, and $\mathfrak{M}$ a model of $\mathcal{T}$. Since $\not\models_{\mathfrak{M}} \neg x = x$, it follows that $\nvdash_{\mathcal{T}} \neg x = x$, thus $\mathcal{T} \neq$ Wff.

© First order definability in a structure: We are now in a position to make the process of translation to and from informal mathematics rigorous.

I.5.15 Definition. Let $L$ be a first order language, and $\mathfrak{M}$ a structure for $L$. A set (synonymously, relation) $S \subseteq M^n$ is (first order) definable in $\mathfrak{M}$ over $L$ iff for some formula $\mathcal{S}(y_1, \ldots, y_n)$ (see p. 18 for a reminder on round-bracket notation) and for all $i_j$, $j = 1, \ldots, n$, in $M$,

$\langle i_1, \ldots, i_n \rangle \in S$ iff $\models_{\mathfrak{M}} \mathcal{S}(\overline{i_1}, \ldots, \overline{i_n})$

We often just say “definable in $\mathfrak{M}$”.

A function $f : M^n \to M$ is definable in $\mathfrak{M}$ over $L$ iff the relation $y = f(x_1, \ldots, x_n)$ is so definable.

N.B. Some authors say “(first order) expressible” (Smullyan (1992)) rather than “(first order) definable” in a structure.

In the context of ($\mathfrak{M}$), the above definition gives precision to statements such as “we code (or translate) an informal statement into the formal language” or “the (formal language) formula $\mathcal{A}$ informally ‘says’ ...”, since any (informal) “statement” (or relation) that depends on the informal variables $x_1, \ldots, x_n$ has the form “$\langle x_1, \ldots, x_n \rangle \in S$” for some (informal) set $S$. It also captures the essence of the statement

“The (informal) statement $\langle x_1, \ldots, x_n \rangle \in S$ can be written (or made) in the formal language.”

What “makes” the statement, in the formal language, is the formula $\mathcal{S}$ that first order defines it.


I.5.16 Example. The informal statement “$z$ is a prime” has a formal translation

$S0 < z \land (\forall x)(\forall y)(z = x \times y \to x = z \lor x = S0)$

over the language of elementary number theory, where the nonlogical symbols are $0, S, +, \times, <$ and the definition (translation) is effected in the standard structure $\mathfrak{N} = (\mathbb{N}; 0; S, +, \times; <)$, where “$S$” satisfies, for all $n \in \mathbb{N}$, $S(n) = n + 1$ and interprets “$S$” (see I.5.2, p. 53, for the “unpacked” notation we have just used to denote the structure $\mathfrak{N}$). We have used the variable name “$z$” both formally and informally.

It must be said that translation is not just an art or skill. There are theoretical limitations to translation. The trivial limitation is that if $M$ is an infinite set and, say, $L$ has a finite set of nonlogical symbols (as is the case in number theory and set theory), then we cannot define all $S \subseteq M$, simply because we do not have enough first order formulas to do so.

There are non-trivial limitations too. Some sets are not first order definable 
because they are “far too complex”. See Section I.9. 
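Returning to Example I.5.16, the claim that the displayed formula defines the set of primes in the standard structure can be spot-checked numerically. This is a sketch, not a proof: the function names are hypothetical, and the two universal quantifiers are cut down to the range 0..z. That restriction loses nothing here, because for z ≥ 1 the hypothesis z = x × y already forces x ≤ z and y ≤ z, and for z = 0 the conjunct S0 < z fails.

```python
def holds_prime_formula(z):
    """Evaluate  S0 < z ∧ (∀x)(∀y)(z = x × y → x = z ∨ x = S0)  at the number z."""
    one = 0 + 1  # the value of the term S0
    if not (one < z):
        return False
    # x and y range over 0..z only; larger values cannot satisfy z = x * y when z >= 1
    return all((z != x * y) or x == z or x == one
               for x in range(z + 1) for y in range(z + 1))

def is_prime(z):
    """Informal primality by trial division, used only for comparison."""
    return z >= 2 and all(z % d != 0 for d in range(2, z))

defined_set = [z for z in range(60) if holds_prime_formula(z)]
```

On the range tested, the set picked out by the formula agrees with `is_prime`, which is what definability in the sense of I.5.15 asks for (on all of the natural numbers, not just an initial segment).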


This is a good place to introduce a common notational argot that allows us to write “mixed-mode” formulas that have a formal part (over some language $L$) but may contain “informal” constants (names of, to be sure, but names that have not formally been imported into $L$) from some structure $\mathfrak{M}$ appropriate for $L$.


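I.5.17 Informal Definition. Let $L$ be a first order language and $\mathfrak{M} = (M, \mathcal{I})$ a structure for $L$. Let $\mathcal{A}$ be a formula with at most $x_1, \ldots, x_n$ free, and $i_1, \ldots, i_n$ members of $M$. The notation $\mathcal{A}[[i_1, \ldots, i_n]]$ is an abbreviation of $(\mathcal{A}[\overline{i_1}, \ldots, \overline{i_n}])^{\mathcal{I}}$.

This argot allows one to substitute informal objects into variables outright, by-passing the procedure of importing formal names for such objects into the language. It is noteworthy that mixed mode formulas can be defined directly by induction on formulas (that is, without forming $L(\mathfrak{M})$ first) as follows:

Let $L$ and $\mathfrak{M}$ be as above. Let $x_1, \ldots, x_n$ contain all the free variables that appear in a term $t$ or formula $\mathcal{A}$ over $L$ (not over $L(\mathfrak{M})$!). Let $i_1, \ldots, i_n$ be arbitrary in $M$.

For terms we define

$$t[[i_1, \ldots, i_n]] = \begin{cases} i_j & \text{if } t \equiv x_j \ (1 \leq j \leq n) \\ a^{\mathcal{I}} & \text{if } t \equiv a \\ f^{\mathcal{I}}(t_1[[i_1, \ldots, i_n]], \ldots, t_r[[i_1, \ldots, i_n]]) & \text{if } t \equiv f t_1 \ldots t_r \end{cases}$$

For formulas we let

$$\mathcal{A}[[i_1, \ldots, i_n]] = \begin{cases} t[[i_1, \ldots, i_n]] = s[[i_1, \ldots, i_n]] & \text{if } \mathcal{A} \equiv t = s \\ P^{\mathcal{I}}(t_1[[i_1, \ldots, i_n]], \ldots, t_r[[i_1, \ldots, i_n]]) & \text{if } \mathcal{A} \equiv P t_1 \ldots t_r \\ \neg(\mathcal{B}[[i_1, \ldots, i_n]]) & \text{if } \mathcal{A} \equiv \neg\mathcal{B} \\ (\mathcal{B}[[i_1, \ldots, i_n]] \lor \mathcal{C}[[i_1, \ldots, i_n]]) & \text{if } \mathcal{A} \equiv \mathcal{B} \lor \mathcal{C} \\ (\exists a \in M)\,\mathcal{B}[[a, i_1, \ldots, i_n]] & \text{if } \mathcal{A} \equiv (\exists z)\mathcal{B}[z, \vec{x}_n] \end{cases}$$

where “$(\exists a \in M) \ldots$” is short for “$(\exists a)(a \in M \land \ldots)$”. The right hand side has no free (informal) variables; thus it evaluates to one of $\mathbf{t}$ or $\mathbf{f}$. ©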


We now turn to the “hard half” of Gödel’s completeness theorem, that our syntactic proof apparatus can faithfully mimic “proofs by logical implication”. That is, the syntactic apparatus is “complete”.

I.5.18 Definition. A theory over $L$ (designated by its nonlogical axioms) $\Gamma$ is semantically complete iff $\Gamma \models \mathcal{A}$ implies $\Gamma \vdash \mathcal{A}$ for any formula $\mathcal{A}$.

© The term “semantically complete” is not being used much. There is a competing syntactic notion of completeness, that of simple completeness, also called just completeness. The latter is the notion one has normally in mind when saying “... a complete theory ...”. More on this shortly. ©

We show the semantic completeness of every first order theory by proving, using the technique of Henkin (1952), the consistency theorem below. The completeness theorem will then be derived as a corollary.

I.5.19 Metatheorem (Consistency Theorem). If a (first order) theory $\mathcal{T}$ is consistent, then it has a model.

We will first give a proof (via a sequence of lemmata) for the case of “countable languages” $L$, that is, languages that have a countable alphabet. We will then amend the proof to include the uncountable case.


A crash course on countable sets: A set $A$ is countable† if it is empty or (in the opposite case) if there is a way to arrange all its members in an infinite sequence, in a “row of locations”, utilizing one location for each member of $\mathbb{N}$. It is allowed to repeatedly list any element of $A$, so that finite sets are countable. Technically, this enumeration is a (total) function $f : \mathbb{N} \to A$ whose range (set of outputs) equals $A$ (that is, $f$ is onto). We say that $f(n)$ is the $n$th element of $A$ in the enumeration $f$. We often write $f_n$ instead of $f(n)$ and then call $n$ a “subscript” or “index”.

We can convert a multi-row enumeration

$(f_{i,j})_{i,j \text{ in } \mathbb{N}}$

into a single row enumeration quite easily. Technically, we say that the set $\mathbb{N} \times \mathbb{N}$, the set of “double subscripts” $(i, j)$, is countable. This is shown diagrammatically below. The “linearization” or “unfolding” of the infinite matrix of rows is effected by walking along the arrows:

(0, 0)    (0, 1)    (0, 2)    (0, 3)    ...
        ↙         ↙         ↙
(1, 0)    (1, 1)    (1, 2)    ...
        ↙         ↙
(2, 0)    (2, 1)    ...
        ↙
(3, 0)    ...

† Naively speaking. The definition is similar to the formal one that is given in volume 2. Here we are just offering a quick-review service (in the metamathematical domain, just in case the reader needs it) in preparation for the proof of the consistency theorem.


This observation yields a very useful fact regarding strings over countable sets (alphabets): If $\mathcal{V}$ is countable, then the set of all strings of length 2 over $\mathcal{V}$ is also countable. Why? Because the arbitrary string of length 2 is of the form $d_i d_j$ where $d_i$ and $d_j$ represent the $i$th and $j$th elements of the enumeration of $\mathcal{V}$ respectively. Unfolding the infinite matrix exactly as above, we get a single-row enumeration of these strings.
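The unfolding of the matrix of double subscripts can be written down explicitly. The sketch below walks the anti-diagonals i + j = 0, 1, 2, ... in turn, each from the top row downwards; the direction of travel within a diagonal and the names `unfold` and `position` are assumptions made for illustration (any fixed direction gives an enumeration).

```python
from itertools import count, islice

def unfold():
    """Yield every pair (i, j) of natural numbers exactly once, one anti-diagonal at a time."""
    for d in count(0):            # d = i + j identifies the anti-diagonal
        for i in range(d + 1):    # from (0, d) down to (d, 0)
            yield (i, d - i)

def position(i, j):
    """Location of (i, j) in the single row produced by unfold()."""
    d = i + j
    return d * (d + 1) // 2 + i   # d*(d+1)/2 pairs precede diagonal d

first_ten = list(islice(unfold(), 10))
```

Reading each pair (i, j) as the string $d_i d_j$ turns this into a single-row enumeration of the strings of length 2. The closed form `position` shows that every pair is reached after finitely many steps.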


By induction on the length $n \geq 2$ of strings we see that the set of strings of any length $n \geq 2$ is also countable. Indeed, a string of length $n + 1$ is a string $ab$, where $a$ has length $n$ and $b \in \mathcal{V}$. By the I.H. the set of all $a$’s can be arranged in a single row (countable), and we are done exactly as in the case of the $d_i d_j$ above.

Finally, let us collect all the strings over $\mathcal{V}$ into a set $S$. Is $S$ countable? Yes. We can arrange $S$, at first, into an infinite matrix of strings $m_{i,j}$, that is, the $j$th string of length $i$. Then we employ our matrix-unfolding trick above.


Suppose now that we start with a countable set $A$. Is every subset of $A$ countable? Yes. If $B \subseteq A$, then the elements of $B$ form a subsequence of the elements of $A$ (in any given enumeration). Therefore, just drop the members of $A$ that are not in $B$, and compact the subscripts.† ©

† By “compact the subscripts” we mean this: After dropping members of $A$ that do not belong to $B$, the enumeration has gaps in general. For example, $B$ might be the subsequence $a_{13}, a_{95}, a_{96}, a_{97}, a_{1001}, \ldots$. We just shift the members of $B$ to the left, eliminating gaps, so that the above few listed members take the locations (subscripts) $0, 1, 2, 3, 4, \ldots$, respectively. Admittedly, this explanation is not precise, but it will have to do for now.

To prove the consistency theorem let us fix a countable language $L$ and a first order theory $\mathcal{T}$ over $L$ with nonlogical axioms $\Gamma$. In the search for a model, we start with a simple countable set, here we will take $\mathbb{N}$ itself, and endow it with enough structure (the $\mathcal{I}$-part) to obtain a model, $\mathfrak{M} = (M, \mathcal{I})$ of $\mathcal{T}$.

Since, in particular, this will entail that a subset of $\mathbb{N}$ (called $M$ in what follows) will be the domain of the structure, we start by importing all the constants $n \in \mathbb{N}$ into $L$. That is, we add to $\mathcal{V}$ a new constant symbol $\overline{n}$ for each $n \in \mathbb{N}$. The new alphabet is denoted by $\mathcal{V}(\mathbb{N})$, and the resulting language $L(\mathbb{N})$.


I.5.20 Definition. In general, let $L = (\mathcal{V}, \mathrm{Term}, \mathrm{Wff})$ be a first order language and $M$ some set. We add to $\mathcal{V}$ a new constant symbol $\overline{i}$ for each $i \in M$. The new alphabet is denoted by $\mathcal{V}(M)$, and the new language by $L(M) = (\mathcal{V}(M), \mathrm{Term}(M), \mathrm{Wff}(M))$.

This concept originated with Henkin and Abraham Robinson. The augmented language $L(M)$ is called the diagram language of $M$.

The above definition generalizes Definition I.5.4 and is useful when (as happens in our current context) we have a language $L$ and a set (here $\mathbb{N}$), but not, as yet, a structure for $L$ with domain $\mathbb{N}$ (or some subset thereof).

Of course, if $\mathfrak{M} = (M, \mathcal{I})$, then $L(M) = L(\mathfrak{M})$.


Two observations are immediate: One, $\Gamma$ has not been affected by the addition of the new constants, and, two, $L(\mathbb{N})$ is still countable.† Thus, there are enumerations

$\mathcal{F}_0, \mathcal{F}_1, \mathcal{F}_2, \ldots$ of all sentences in Wff($\mathbb{N}$) (1)

and

$\mathcal{G}_1, \mathcal{G}_2, \mathcal{G}_3, \ldots$ of all sentences in Wff($\mathbb{N}$) of the form $(\exists x)\mathcal{A}$ (2)

where, in (2), every sentence $(\exists x)\mathcal{A}$ of Wff($\mathbb{N}$) is listed infinitely often.

Pause. How can we do this? Form an infinite matrix, each row of which is the same fixed enumeration of the $(\exists x)\mathcal{A}$ sentences. Then unfold the matrix into a single row.
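The answer to the Pause can be made concrete. In the sketch below, `item(j)` stands for the $j$th entry of some fixed enumeration (a hypothetical parameter; here plain numbers play the role of the sentences). Every row of the matrix is that same enumeration, so when the matrix is unfolded along anti-diagonals only the column index matters.

```python
from itertools import count, islice

def list_infinitely_often(item):
    """Unfold the matrix whose every row is item(0), item(1), item(2), ..."""
    for d in count(0):              # anti-diagonal: row + column = d
        for row in range(d + 1):
            column = d - row
            yield item(column)      # all rows are identical, so the row index is ignored

prefix = list(islice(list_infinitely_often(lambda j: j), 15))
```

Entry `item(j)` occurs once on each anti-diagonal d ≥ j, hence infinitely often in the unfolded row. In `prefix`, which covers the first five anti-diagonals, the entry 0 already appears five times.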


With the preliminaries out of the way, we next define by induction (or recursion) over $\mathbb{N}$ an infinite sequence of theories, successively adding sentences over $L(\mathbb{N})$ as new nonlogical axioms: We set $\Gamma_0 = \Gamma$. For any $n \geq 0$, we define $\Gamma_{n+1}$ in two stages: We first let

$$\Delta_n = \begin{cases} \Gamma_n \cup \{\mathcal{F}_n\} & \text{if } \Gamma_n \nvdash \neg\mathcal{F}_n \\ \Gamma_n & \text{otherwise} \end{cases} \qquad (3)$$

Then, we set

$$\Gamma_{n+1} = \begin{cases} \Delta_n \cup \{\mathcal{A}[x \leftarrow \overline{i}]\} & \text{if } \Delta_n \vdash \mathcal{G}_{n+1}, \text{ where } \mathcal{G}_{n+1} \equiv (\exists x)\mathcal{A} \\ \Delta_n & \text{otherwise} \end{cases}$$

† If $A$ and $B$ are countable, then so is $A \cup B$, since we can arrange the union as an infinite matrix, the 0th row being occupied by $A$-members while all the other rows being identical to some fixed enumeration of $B$.
The choice of $\overline{i}$ is important: It is the smallest $i$ such that the constant $\overline{i}$ does not occur (as a substring) in any of the sentences

$\mathcal{F}_0, \ldots, \mathcal{F}_n, \mathcal{G}_1, \ldots, \mathcal{G}_{n+1}$ (4)
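The sentence $\mathcal{A}[x \leftarrow \overline{i}]$ added to $\Gamma_{n+1}$ is called a special Henkin axiom.† The constant $\overline{i}$ is the associated Henkin (also called witnessing) constant.

We now set

$$\Gamma_\infty = \bigcup_{n \in \mathbb{N}} \Gamma_n \qquad (5)$$

This is a set of formulas over Wff($\mathbb{N}$) that defines a theory $\mathcal{T}_\infty$ over Wff($\mathbb{N}$) (as the set of nonlogical axioms of $\mathcal{T}_\infty$).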
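The first stage of each step, rule (3), has a propositional analogue that can actually be run, and it may help to see it in action. The sketch below is not the construction of the text: it swaps first order provability (which is not decidable) for a truth-table test over two hypothetical propositional variables, relying on the fact that in propositional logic a set of axioms fails to prove ¬F exactly when the axioms together with F are satisfiable. The Henkin (witness) stage has no counterpart here, and all names are illustrative.

```python
from itertools import product

ATOMS = ["p", "q"]  # hypothetical propositional variables

def value(formula, v):
    """Truth value of a formula (nested tuples) under the assignment v."""
    op = formula[0]
    if op == "atom":
        return v[formula[1]]
    if op == "not":
        return not value(formula[1], v)
    if op == "or":
        return value(formula[1], v) or value(formula[2], v)
    raise ValueError(op)

def consistent(axioms):
    """Propositional stand-in for consistency: some assignment satisfies every axiom."""
    return any(all(value(a, dict(zip(ATOMS, bits))) for a in axioms)
               for bits in product([False, True], repeat=len(ATOMS)))

def extend(gamma, enumeration):
    """Stage n adds F_n unless the axioms collected so far refute it (analogue of rule (3))."""
    current = list(gamma)
    for f_n in enumeration:
        # "current does not prove ¬F_n" is tested as "current + [F_n] is satisfiable"
        if consistent(current + [f_n]):
            current.append(f_n)
    return current

p, q = ("atom", "p"), ("atom", "q")
gamma = [("or", p, q)]
enumeration = [("not", p), p, q, ("not", q)]
completed = extend(gamma, enumeration)
```

With this enumeration the result is `[p ∨ q, ¬p, q]`: the sentence ¬p is admitted first, which then blocks p, and q is admitted, which blocks ¬q. The outcome is consistent and decides every sentence in the enumeration, mirroring in miniature the consistency and simple completeness established for the first order construction.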


I.5.21 Lemma. The theory $\mathcal{T}_\infty$ is consistent.

Proof. It suffices to show that each of the theories $\Gamma_n$ is consistent.

(Indeed, if $\vdash_{\mathcal{T}_\infty} \neg x = x$,‡ then $\mathcal{A}_1, \ldots, \mathcal{A}_m \vdash \neg x = x$, for some $\mathcal{A}_i$ ($i = 1, \ldots, m$) in $\Gamma_\infty$, since proofs have finite length. Let $\Gamma_n$ include all of $\mathcal{A}_1, \ldots, \mathcal{A}_m$.

Pause. Is there such a $\Gamma_n$?

Then $\Gamma_n$ will be inconsistent.)

On to the main task. We know (assumption) that $\Gamma_0$ is consistent. We take the I.H. that $\Gamma_n$ is consistent, and consider $\Gamma_{n+1}$ next.

First, we argue that $\Delta_n$ is consistent. If $\Delta_n = \Gamma_n$, then we are done by the I.H. If $\Delta_n = \Gamma_n \cup \{\mathcal{F}_n\}$, then inconsistency would entail (by I.4.21) $\Gamma_n \vdash \neg\mathcal{F}_n$, contradicting the hypothesis of the case we are at (top case of (3) above).

Next, we show that $\Delta_n \cup \{\mathcal{A}[x \leftarrow \overline{i}]\}$ is consistent, where $\mathcal{A}[x \leftarrow \overline{i}]$ is the added Henkin axiom, if indeed such an axiom was added.

Suppose instead that $\Delta_n \cup \{\mathcal{A}[x \leftarrow \overline{i}]\} \vdash \neg z = z$, for some variable $z$. Now $\overline{i}$ does not occur in any formulas in the set $\Delta_n \cup \{(\exists x)\mathcal{A}, \neg z = z\}$.

Since $(\exists x)\mathcal{A} \equiv \mathcal{G}_{n+1}$, and $\Delta_n \vdash \mathcal{G}_{n+1}$, we get $\Delta_n \vdash \neg z = z$ by I.4.27 (auxiliary constant metatheorem). This is no good, since $\Delta_n$ is supposed to be consistent.

† Another possible choice for Henkin axiom is $(\exists x)\mathcal{A} \to \mathcal{A}[x \leftarrow \overline{i}]$.

‡ $\neg x = x$ is our favourite “contradiction builder”. See also I.4.22.


I.5.22 Definition. A theory $\mathcal{T}$ over $L$ decides a sentence $\mathcal{A}$ iff one of $\vdash_{\mathcal{T}} \mathcal{A}$ or $\vdash_{\mathcal{T}} \neg\mathcal{A}$ holds. We say that $\mathcal{A}$ is decidable by $\mathcal{T}$. In the case that $\vdash_{\mathcal{T}} \neg\mathcal{A}$ holds, we say that $\mathcal{T}$ refutes $\mathcal{A}$. $\mathcal{T}$ is simply complete, or just complete, iff every sentence is decidable by $\mathcal{T}$.

© The definition is often offered in terms of consistent theories, since an inconsistent theory decides every formula anyway. ©

I.5.23 Lemma. $\mathcal{T}_\infty$ is simply complete.

Proof. Let $\mathcal{A}$ be a sentence. Then $\mathcal{A} \equiv \mathcal{F}_n$ for some $n$. If $\Gamma_n \vdash \neg\mathcal{F}_n$, then we are done. If however $\Gamma_n$ does not refute $\mathcal{F}_n$, then we are done by (3) (p. 65).


I.5.24 Lemma. $\mathcal{T}_\infty$ has the witness property,† namely, whenever $\vdash_{\mathcal{T}_\infty} (\exists x)\mathcal{A}$, where $(\exists x)\mathcal{A}$ is a sentence over $L(\mathbb{N})$, then for some $m \in \mathbb{N}$ we have $\vdash_{\mathcal{T}_\infty} \mathcal{A}[x \leftarrow \overline{m}]$.

Proof. Since the proof $\vdash_{\mathcal{T}_\infty} (\exists x)\mathcal{A}$ involves finitely many formulas from $\Gamma_\infty$, there is an $n \geq 0$ such that $\Delta_n \vdash (\exists x)\mathcal{A}$. Now, $(\exists x)\mathcal{A} \equiv \mathcal{G}_{k+1}$ for some $k \geq n$, since $(\exists x)\mathcal{A}$ occurs in the sequence (2) (p. 64) infinitely often. But then $\Delta_k \vdash (\exists x)\mathcal{A}$ as well.

Pause. Why?

Hence, $\mathcal{A}[x \leftarrow \overline{m}]$ is added to $\Gamma_{k+1}$ as a Henkin axiom, for an appropriate Henkin constant $\overline{m}$, and we are done.

I.5.25 Definition. We next define a relation, $\sim$, on $\mathbb{N}$ by

$n \sim m$ iff $\vdash_{\mathcal{T}_\infty} \overline{n} = \overline{m}$ (6)

† We also say that it is a Henkin theory.

$\sim$ has the following properties:

(a) Reflexivity. $n \sim n$ (all $n$): By $\vdash x = x$ and I.4.12.

(b) Symmetry. If $n \sim m$, then $m \sim n$ (all $m, n$): It follows from $\vdash x = y \to y = x$ (Exercise I.16) and I.4.12.

(c) Transitivity. If $n \sim m$ and $m \sim k$, then $n \sim k$ (all $m, n, k$): It follows from $\vdash x = y \to y = z \to x = z$ (Exercise I.17) and I.4.12.

Let us define a function $f : \mathbb{N} \to \mathbb{N}$ by

$f(n)$ = smallest $m$ such that $m \sim n$ (7)

By (a) above, $f$ is totally defined. We also define a subset $M$ of $\mathbb{N}$, by

$M = \{f(n) : n \in \mathbb{N}\}$ (8)

This $M$ will be the domain of a structure that we will build, and show that it is a model of the original $\mathcal{T}$.
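The passage from $\sim$ to $f$ and $M$ in (7) and (8) is the familiar choice of least representatives of equivalence classes. The sketch below illustrates it on an initial segment of the natural numbers; the parameter `equivalent` is a hypothetical decidable stand-in for the relation (6), which in the actual construction is given by provability in $\mathcal{T}_\infty$ and need not be computable.

```python
def representatives(n_max, equivalent):
    """Return (f, M) for the numbers below n_max, where f[n] is the smallest m with m ~ n."""
    # By reflexivity n ~ n, so the smallest such m is at most n and the search is bounded.
    f = {n: min(m for m in range(n + 1) if equivalent(m, n)) for n in range(n_max)}
    M = sorted(set(f.values()))
    return f, M

# Hypothetical equivalence relation: congruence modulo 3.
f, M = representatives(10, lambda m, n: m % 3 == n % 3)
```

In this toy case `M` is `[0, 1, 2]`, and `f[m] == m` for every `m` in `M`, which is the observation that members of $M$ are fixed by $f$.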


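First, we modify $\Gamma_\infty$ “downwards”.

I.5.26 Definition. The $M$-restriction of a formula $\mathcal{A}$ is the formula $\mathcal{A}^M$ obtained from $\mathcal{A}$ by replacing by $\overline{f(n)}$ every occurrence of an $\overline{n}$ in $\mathcal{A}$.

We now let

$\Gamma_\infty^M = \{\mathcal{A}^M : \mathcal{A} \in \Gamma_\infty\}$ (9)

We have the following results regarding $\Gamma_\infty^M$ (or the associated theory $\mathcal{T}_\infty^M$):

© I.5.27 Remark. Before proceeding, we note that the language of $\mathcal{T}_\infty^M$ is $L(M)$, and that $\mathcal{A} \equiv \mathcal{A}^M$ if $\mathcal{A}$ is over $L(M)$, since $f(m) = m$ for $m \in M$ (why?). ©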


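I.5.28 Lemma. Let $\mathcal{A}$ be over $L(\mathbb{N})$. If $\Gamma_\infty \vdash \mathcal{A}$, then $\Gamma_\infty^M \vdash \mathcal{A}^M$.

Proof. Induction on $\Gamma_\infty$-theorems.

Basis. If $\mathcal{A} \in \Gamma_\infty$, then $\mathcal{A}^M \in \Gamma_\infty^M$ by (9). If $\mathcal{A} \in \Lambda$, then $\mathcal{A}^M \in \Lambda$ (why?).

Modus ponens. Let $\Gamma_\infty \vdash \mathcal{B}$ and $\Gamma_\infty \vdash \mathcal{B} \to \mathcal{A}$. By the I.H., $\Gamma_\infty^M \vdash \mathcal{B}^M$ and $\Gamma_\infty^M \vdash \mathcal{B}^M \to \mathcal{A}^M$, and we are done.

∃-introduction. Let $\Gamma_\infty \vdash \mathcal{B} \to \mathcal{C}$, where $x$ is not free in $\mathcal{C}$ and $\mathcal{A} \equiv (\exists x)\mathcal{B} \to \mathcal{C}$. By the I.H., $\Gamma_\infty^M \vdash \mathcal{B}^M \to \mathcal{C}^M$, and we are done by ∃-introduction.

I.5.29 Lemma. $\mathcal{T}_\infty^M \leq \mathcal{T}_\infty$, conservatively.

Proof. Leaving the “conservatively” part aside for a moment, let us verify that for any $\mathcal{A}$ over $L(\mathbb{N})$

$\Gamma_\infty \vdash \mathcal{A} \leftrightarrow \mathcal{A}^M$ (∗)

This follows from $\Gamma_\infty \vdash \overline{n} = \overline{f(n)}$ (recall that $n \sim f(n)$ by definition of $f$) for all $n \in \mathbb{N}$, and Exercise I.18.

Because of (∗), $\Gamma_\infty$ can prove any $\mathcal{B} \in \Gamma_\infty^M$. Indeed, let $\mathcal{B} \equiv \mathcal{A}^M$ for some $\mathcal{A} \in \Gamma_\infty$ (by (9)). By (∗) and $\Gamma_\infty \vdash \mathcal{A}$, we obtain $\Gamma_\infty \vdash \mathcal{B}$.

Thus, $\mathcal{T}_\infty^M \leq \mathcal{T}_\infty$.

Pause. Do you believe this conclusion?

Turning to the “conservatively” part, this follows from Lemma I.5.28 and Remark I.5.27.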


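I.5.30 Corollary. $\mathcal{T}_\infty^M$ is consistent.

Proof. If it can prove $\neg x = x$, then so can $\mathcal{T}_\infty$.

I.5.31 Lemma. $\mathcal{T}_\infty^M$ is simply complete.

Proof. Let the sentence $\mathcal{A}$ be over $L(M)$. It is decidable by $\mathcal{T}_\infty$ (I.5.23). By I.5.28, $\mathcal{T}_\infty^M$ decides $\mathcal{A}^M$. But that’s $\mathcal{A}$.

I.5.32 Lemma. $\mathcal{T}_\infty^M$ is a Henkin theory over $L(M)$.

Proof. Let $\Gamma_\infty^M \vdash (\exists x)\mathcal{A}$, where $(\exists x)\mathcal{A}$ is a sentence. Then $\Gamma_\infty \vdash (\exists x)\mathcal{A}$ as well, hence $\Gamma_\infty \vdash \mathcal{A}[x \leftarrow \overline{n}]$, for some $n \in \mathbb{N}$ (by I.5.24).

It follows that $\Gamma_\infty^M \vdash \mathcal{A}^M[x \leftarrow \overline{f(n)}]$, and we are done, since $\mathcal{A} \equiv \mathcal{A}^M$ and $f(n) \in M$.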


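I.5.33 Lemma. $\mathcal{T}_\infty^M$ distinguishes the constants of $M$, that is, if $n \neq m$ (both in $M$), then $\vdash_{\mathcal{T}_\infty^M} \neg\overline{n} = \overline{m}$.

Proof. By I.5.31, if $\vdash_{\mathcal{T}_\infty^M} \neg\overline{n} = \overline{m}$ fails, then $\vdash_{\mathcal{T}_\infty^M} \overline{n} = \overline{m}$; hence $n \sim m$. By the definition of $f$ and $M$ (p. 67), it follows that $n = m$ (since each set $\{m \in \mathbb{N} : m \sim n\}$ determined by $n$ can have exactly one smallest element, and any two distinct such sets, determined by $n$ and $k$, have no common elements (why?)). This contradicts the assumption that $n \neq m$.

© We have started with a consistent theory $\mathcal{T}$ over $L$. We now have a consistent, complete, Henkin extension $\mathcal{T}_\infty^M$ of $\mathcal{T}$, over the language $L(M)$ ($M \subseteq \mathbb{N}$), which distinguishes the constants of $M$.

We are now ready to define our model $\mathfrak{M} = (M, \mathcal{I})$. We are pleased to note that the constants of $M$ are already imported into the language, as required by Definitions I.5.5 and I.5.6. ©

For any predicate† $P$ of $L$ of arity $k$, we define for arbitrary $n_1, \ldots, n_k$ in $M$

$P^{\mathcal{I}}(n_1, \ldots, n_k) = \mathbf{t}$ iff $\Gamma_\infty^M \vdash P\overline{n_1} \ldots \overline{n_k}$ (A)

that is, we are defining the set of $k$-tuples (relation) $P^{\mathcal{I}}$ by

$P^{\mathcal{I}} = \{\langle n_1, \ldots, n_k \rangle : \Gamma_\infty^M \vdash P\overline{n_1} \ldots \overline{n_k}\}$ (A′)

Let next $f$ be a function letter of arity $k$, and let $n_1, \ldots, n_k$ be an input for $f^{\mathcal{I}}$. What is the appropriate output?‡

Well, first observe that $\Gamma_\infty^M \vdash (\exists x) f\overline{n_1} \ldots \overline{n_k} = x$ (why?). By I.5.32, there is an $m \in M$ such that

$\Gamma_\infty^M \vdash f\overline{n_1} \ldots \overline{n_k} = \overline{m}$ (B)

We need this $m$ to be unique. It is so, for if also $\Gamma_\infty^M \vdash f\overline{n_1} \ldots \overline{n_k} = \overline{j}$, then (Exercise I.17) $\Gamma_\infty^M \vdash \overline{m} = \overline{j}$, and thus $m = j$ (if $m \neq j$, then also $\Gamma_\infty^M \vdash \neg\overline{m} = \overline{j}$, by I.5.33; impossible, since $\Gamma_\infty^M$ is consistent).

For the input $n_1, \ldots, n_k$ we set

$f^{\mathcal{I}}(n_1, \ldots, n_k) = m$ (B.1)

where $m$ is uniquely determined by (B). This defines $f^{\mathcal{I}}$.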


The case of constants is an interesting special case.§ As above, we let $a^{\mathcal{I}}$ be the unique $m \in M$ such that

$$\Gamma^M \vdash a = \overline{m} \tag{C}$$

The interesting, indeed crucial, observation (required by I.5.5) is that, for any imported constant $\overline{m}$, we have $\overline{m}^{\mathcal{I}} = m$. Indeed, this follows from uniqueness and the trivial fact $\Gamma^M \vdash \overline{m} = \overline{m}$.


† Recall that we have added no new predicates and no new functions in going from $L$ to $L(M)$. We have just added constants.

‡ We have yet to determine $f^{\mathcal{I}}$. Here we are just "thinking out loud" towards suggesting a good $f^{\mathcal{I}}$.

§ Actually, it is not a special case for us, since we did not allow 0-ary functions. But some authors do.
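The following will be handy in the proof of the Main Lemma below:

$$\Gamma^M \vdash t = \overline{t^{\mathcal{I}}} \tag{D}$$

for any closed term $t$ over $L(M)$. We prove (D) by induction on terms $t$.

Basis. If $t \equiv a$ ($a$ original or imported), then (C) reads $\Gamma^M \vdash a = \overline{a^{\mathcal{I}}}$.

Let $t \equiv f t_1 \ldots t_k$. By I.5.5, $t^{\mathcal{I}} = f^{\mathcal{I}}(t_1^{\mathcal{I}}, \ldots, t_k^{\mathcal{I}})$. Set $n_i = t_i^{\mathcal{I}}$ for $i = 1, \ldots, k$. Then $t^{\mathcal{I}} = f^{\mathcal{I}}(n_1, \ldots, n_k)$. By (B) and (B.1),

$$\Gamma^M \vdash f\,\overline{n_1} \ldots \overline{n_k} = \overline{f^{\mathcal{I}}(n_1, \ldots, n_k)}$$

By the I.H., $\Gamma^M \vdash t_i = \overline{t_i^{\mathcal{I}}}$; in other words $\Gamma^M \vdash t_i = \overline{n_i}$, for $i = 1, \ldots, k$. By Ax4, we obtain (via Exercise I.19)

$$\Gamma^M \vdash f t_1 \ldots t_k = f\,\overline{n_1} \ldots \overline{n_k}$$

Thus (Exercise I.17),

$$\Gamma^M \vdash f t_1 \ldots t_k = \overline{f^{\mathcal{I}}(n_1, \ldots, n_k)}$$

This concludes the proof of (D).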


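I.5.34 Main Lemma. For every sentence $\mathscr{A}$ over $L(M)$, $\mathscr{A}^{\mathcal{I}} = \mathbf{t}$ iff $\Gamma^M \vdash \mathscr{A}$.


Proof. This is proved by induction on formulas.

Basis. Case where $\mathscr{A} \equiv P t_1 \ldots t_k$: Let $n_i = t_i^{\mathcal{I}}$ ($i = 1, \ldots, k$). By (A) above (p. 69), $P^{\mathcal{I}}(n_1, \ldots, n_k) = \mathbf{t}$ iff $\Gamma^M \vdash P\,\overline{n_1} \ldots \overline{n_k}$ iff $\Gamma^M \vdash P t_1 \ldots t_k$, the second "iff" by Ax4 (via Exercise I.19) and (D). We are done in this case, since $\mathscr{A}^{\mathcal{I}} = P^{\mathcal{I}}(t_1^{\mathcal{I}}, \ldots, t_k^{\mathcal{I}})$.

Case where $\mathscr{A} \equiv t = s$: Let also $n = t^{\mathcal{I}}$ and $m = s^{\mathcal{I}}$. Then $\mathscr{A}^{\mathcal{I}} = \mathbf{t}$ iff $t^{\mathcal{I}} = s^{\mathcal{I}}$ iff $n = m$ iff $\Gamma^M \vdash \overline{n} = \overline{m}$ (for the last "iff" the if part follows by consistency and I.5.33; the only-if part by Ax3 and I.4.12). Finally, by (D) and the properties of equality, $\Gamma^M \vdash \overline{n} = \overline{m}$ iff $\Gamma^M \vdash t = s$.

The induction steps. Let $\mathscr{A} \equiv \neg\mathscr{B}$. Then $\mathscr{A}^{\mathcal{I}} = \mathbf{t}$ iff $\mathscr{B}^{\mathcal{I}} = \mathbf{f}$ iff (I.H.) $\Gamma^M \nvdash \mathscr{B}$ iff (completeness I.5.31 ($\rightarrow$) and consistency I.5.30 ($\leftarrow$)) $\Gamma^M \vdash \mathscr{A}$.

Let $\mathscr{A} \equiv \mathscr{B} \lor \mathscr{C}$. Consider first $\mathscr{B}^{\mathcal{I}} \lor \mathscr{C}^{\mathcal{I}} = \mathbf{t}$. Say $\mathscr{B}^{\mathcal{I}} = \mathbf{t}$. By I.H., $\Gamma^M \vdash \mathscr{B}$; hence $\Gamma^M \vdash \mathscr{B} \lor \mathscr{C}$ by tautological implication. Similarly if $\mathscr{C}^{\mathcal{I}} = \mathbf{t}$. Conversely, let $\Gamma^M \vdash \mathscr{B} \lor \mathscr{C}$. Then one of $\Gamma^M \vdash \mathscr{B}$ or $\Gamma^M \vdash \mathscr{C}$ must be the case: If neither holds, then $\Gamma^M \vdash \neg\mathscr{B}$ and $\Gamma^M \vdash \neg\mathscr{C}$ by completeness, hence $\Gamma^M$ is inconsistent (why?).

The final case: $\mathscr{A} \equiv (\exists x)\mathscr{B}$. Let $((\exists x)\mathscr{B})^{\mathcal{I}} = \mathbf{t}$. Then $(\mathscr{B}[x \leftarrow \overline{n}])^{\mathcal{I}} = \mathbf{t}$ for some $n \in M$. By I.H., $\Gamma^M \vdash \mathscr{B}[x \leftarrow \overline{n}]$; hence $\Gamma^M \vdash (\exists x)\mathscr{B}$ by Ax2 and MP. Conversely, let $\Gamma^M \vdash (\exists x)\mathscr{B}$. By I.5.32, $\Gamma^M \vdash \mathscr{B}[x \leftarrow \overline{n}]$, where $\overline{n}$ is the appropriate Henkin constant. By I.H., $(\mathscr{B}[x \leftarrow \overline{n}])^{\mathcal{I}} = \mathbf{t}$; hence $\mathscr{A}^{\mathcal{I}} = \mathbf{t}$.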


Finally,

Proof (of the Consistency Theorem). Let $\mathscr{A}'$ be an $\mathfrak{M}$-instance of a formula $\mathscr{A}$ in $\Gamma$.

Note that $\mathscr{A}$ is over $L$.

Since $\Gamma^M$ extends $\Gamma$, it follows that $\Gamma^M \vdash \mathscr{A}$, and hence $\Gamma^M \vdash \mathscr{A}'$, by I.4.12. By the Main Lemma, $(\mathscr{A}')^{\mathcal{I}} = \mathbf{t}$.

Thus, the reduct $\mathfrak{M}' = (M, \mathcal{I} \restriction L)$ is a model of $\Gamma$.


We had to move to the reduct $\mathfrak{M}'$ to be technically correct. While $\mathfrak{M}$ "satisfies" $\Gamma$, its $\mathcal{I}$ also acts on symbols not in $L$. The $\mathcal{I}$ of a structure appropriate for $L$ is only supposed to assign meaning to the symbols of $L$.


I.5.35 Corollary. A consistent theory over a countable language has a countable model.


I.5.36 Corollary (Löwenheim-Skolem Theorem). If a set of formulas $\Gamma$ over a countable language has a model, then it has a countable model.


Proof. If a model exists, then the theory $\Gamma$ is consistent.


I.5.37 Corollary (Gödel's Completeness Theorem). In any countable first order language $L$, $\Gamma \models \mathscr{A}$ implies $\Gamma \vdash \mathscr{A}$.


Proof. Let $\mathscr{B}$ denote the universal closure of $\mathscr{A}$. By Exercise I.43, $\Gamma \models \mathscr{B}$. Thus, $\Gamma + \neg\mathscr{B}$ has no models (why?). Therefore it is inconsistent. Thus, $\Gamma \vdash \mathscr{B}$ (by I.4.21), and hence (specialization), $\Gamma \vdash \mathscr{A}$.


A way to rephrase completeness is that if $\Gamma \models \mathscr{A}$, then also $\Delta \models \mathscr{A}$, where $\Delta \subseteq \Gamma$ is finite. This follows by soundness, since $\Gamma \models \mathscr{A}$ entails $\Gamma \vdash \mathscr{A}$ and hence $\Delta \vdash \mathscr{A}$, where $\Delta$ consists of just those formulas of $\Gamma$ used in the proof of $\mathscr{A}$.


I.5.38 Corollary (Compactness Theorem). In any countable first order language $L$, a set of formulas $\Gamma$ is satisfiable iff it is finitely satisfiable.


Proof. Only-if part. This is trivial, for a model of $\Gamma$ is a model of any finite subset.

If part. Suppose that $\Gamma$ is unsatisfiable (it has no models). Then it is inconsistent by the consistency theorem. In particular, $\Gamma \vdash \neg\,x = x$. Since the pure theory over $L$ is consistent, a $\Gamma$-proof of $\neg\,x = x$ involves a nonempty finite sequence of nonlogical axioms (formulas of $\Gamma$), $\mathscr{A}_1, \ldots, \mathscr{A}_n$. That is, $\mathscr{A}_1, \ldots, \mathscr{A}_n \vdash \neg\,x = x$; hence $\{\mathscr{A}_1, \ldots, \mathscr{A}_n\}$ has no model (by soundness). This contradicts the hypothesis.

Alternatively, we can prove the above by invoking "syntactic compactness": A set of formulas is consistent iff every finite subset is consistent, since proofs have finite length. Now invoke the consistency theorem and I.5.14.


We conclude this section by outlining the amendments to our proof that will remove the restriction that $L$ is countable. This plan of amendments presupposes some knowledge† of ordinals and cardinality (cf. volume 2) beyond what our "crash course" has covered. The reader may accept the statements proved here and skip the proofs with no loss of continuity. These statements, in particular the Gödel-Mal'cev compactness theorem, are applied later on to founding nonstandard analysis (following A. Robinson).

Let $L$ be (possibly) uncountable, in the sense that the cardinality $k$ of $\mathcal{V}$ is $\geq \omega$. The cardinality of the set of all strings over $\mathcal{V}$ is also $k$ (for a proof see volume 2, Chapter VII). We now pick and fix for our discussion an arbitrary set $N$ of cardinality $k$.


I.5.39 Remark. An example of such a set is $k$ itself, and can be taken as $N$. However, we can profit from greater generality: $N$ can be any set of any type of (real) objects that we choose with some purpose in mind, as long as its cardinality is $k$.


Therefore the elements of $N$ can be arranged in a transfinite sequence (indexed by all the ordinals $\alpha < k$)

$$n_0, n_1, \ldots, n_\alpha, \ldots$$

We then form $L(N)$ (to parallel $L(\mathbb{N})$ of our previous construction) by adding to $\mathcal{V}$ a distinct name $\overline{n_\alpha}$ for each $n_\alpha \in N$. Thus, we have enumerations

$$\mathscr{F}_0, \mathscr{F}_1, \mathscr{F}_2, \ldots, \mathscr{F}_\alpha, \ldots \quad\text{of all sentences in } \mathbf{Wff}(N) \tag{1$'$}$$

and

$$\mathscr{G}_1, \mathscr{G}_2, \mathscr{G}_3, \ldots, \mathscr{G}_\alpha, \ldots \quad\text{of all sentences in } \mathbf{Wff}(N) \text{ of the form } (\exists x)\mathscr{A} \tag{2$'$}$$

where, in (2'), every sentence $(\exists x)\mathscr{A}$ of $\mathbf{Wff}(N)$ is listed infinitely often, and the indices (subscripts) in both (1') and (2') run through all the ordinals $\alpha < k$ (omitting index 0 in the second listing).

We next proceed as expected: We define by induction (or recursion) over the ordinals $\alpha < k$ a transfinite sequence of theories (determined by $\Gamma$ and additional sentences over $L(N)$ as nonlogical axioms‡):


† On an informal level, of course. All this is going on in the metatheory, just like the countable case.

‡ Note that the formulas in $\Gamma$ need not be closed.
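We set $\Gamma_0 = \Gamma$. For any $\alpha < k$, we define $\Gamma_\alpha$ to be $\bigcup_{\beta < \alpha} \Gamma_\beta$ just in case $\alpha$ is a limit ordinal. If $\alpha = \beta + 1$ (a successor) then the definition is effected in two stages: We first let

$$\Delta_\beta = \begin{cases} \Gamma_\beta \cup \{\mathscr{F}_\beta\} & \text{if } \Gamma_\beta \nvdash \neg\mathscr{F}_\beta \\ \Gamma_\beta & \text{otherwise} \end{cases} \tag{3$'$}$$

Then, we set $\Gamma_\alpha = \Delta_\beta \cup \{\mathscr{A}[x \leftarrow \overline{i}]\}$ just in case $\Delta_\beta \vdash \mathscr{G}_\alpha$, where $\mathscr{G}_\alpha \equiv (\exists x)\mathscr{A}$.

The choice of $\overline{i}$ is important: $i = n_\gamma \in N$, where $\gamma < k$ is the smallest index such that the constant $\overline{n_\gamma}$ does not occur (as a substring) in any of the sentences

$$\mathscr{F}_0, \ldots, \mathscr{F}_\beta, \mathscr{G}_1, \ldots, \mathscr{G}_\alpha \tag{4$'$}$$

The sentence $\mathscr{A}[x \leftarrow \overline{i}]$ added to $\Gamma_\alpha$ is called a special Henkin axiom. The constant $\overline{i}$ is the associated Henkin or witnessing constant.

We now set

$$\Gamma_k = \bigcup_{\alpha < k} \Gamma_\alpha \tag{5$'$}$$

This is a set of formulas over $\mathbf{Wff}(N)$ that defines a theory $\Gamma_k$ over $\mathbf{Wff}(N)$ (as the set of nonlogical axioms of $\Gamma_k$). We next define $\sim$, on $N$ this time, as before ((6) on p. 66):

$$n \sim m \quad\text{iff}\quad \vdash_{\Gamma_k} \overline{n} = \overline{m}$$

We note its properties, and proceed to define a subset $M$ of $N$ as in (7) and (8) (p. 67).

Since $M \subseteq N$, its cardinality is $\leq k$. After defining the $M$-restriction of a formula $\mathscr{A}$ as before, all the rest proceeds as in Lemmata I.5.28-I.5.33, replacing throughout $\Gamma_\infty$, $\Gamma^M$, and members $i \in \mathbb{N}$ by $\Gamma_k$, $\Gamma_k^M$, and members $i \in N$ respectively. We are then able to state: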


I.5.40 Metatheorem (Consistency Theorem). If a (first order) theory $\Gamma$ over a language $L$ of cardinality $k$ is consistent, then it has a model of cardinality $\leq k$.


Terminology: The cardinality of a model is that of its domain.


I.5.41 Corollary (Completeness Theorem). In any first order language $L$, $\Gamma \models \mathscr{A}$ implies $\Gamma \vdash \mathscr{A}$.
I.5.42 Corollary (Gödel-Mal'cev Compactness Theorem). In any first order language $L$, a set of formulas is satisfiable iff it is finitely satisfiable.


The Löwenheim-Skolem theorem takes the following form:


I.5.43 Corollary (Upward Löwenheim-Skolem Theorem). If a set of formulas $\Gamma$ over a language $L$ of cardinality $k$ has an infinite model, then it has a model of any cardinality $n$ such that $k \leq n$.


Proof. Let $\mathfrak{K} = (K, \mathcal{I})$ be an infinite model of $\Gamma$. Pick a set $N$ of cardinality $n$, and import its individuals $c$ as new formal constants $\overline{c}$ into the language of $\Gamma$. The set $\Delta = \Gamma \cup \{\neg\,\overline{c} = \overline{d} : c \neq d \text{ on } N\}$ is finitely satisfiable. This is because every finite subset of $\Delta$ involves only finitely many of the sentences $\neg\,\overline{c} = \overline{d}$; thus there is capacity in $K$ (as it is infinite) to extend $\mathcal{I}$ into $\mathcal{I}'$ (keeping $K$ the same, but defining distinct $\overline{c}^{\mathcal{I}'} \neq \overline{d}^{\mathcal{I}'}$, etc.) to satisfy these sentences in an expanded structure $\mathfrak{K}' = (K, \mathcal{I}')$.

Hence $\Delta$ is consistent.

Following the construction given earlier, we take this $N$ (and $\Delta$) as our starting point to build a simply complete, consistent extension $\Delta_n$ of $\Delta$, and a model $\mathfrak{M}$ for $\Delta$, with domain some subset $M$ of $N$. The choice of $M$ follows the definition of "$\sim$" (see pp. 66 and 73). Under the present circumstances, "$\sim$" is "$=$" on $N$, for $\vdash_{\Delta_n} \overline{c} = \overline{d}$ implies $c = d$ on $N$ (otherwise $\neg\,\overline{c} = \overline{d}$ is an axiom, which is impossible, since $\Delta_n$ is consistent). Thus $M = N$; hence the cardinality of $M$ is $n$.

The reduct of $\mathfrak{M}$ on $L$ is what we want. Of course, its cardinality is still right ($M$ did not change).


Good as the above proof may be, it relies on the particular proof of the consistency theorem, and this is a drawback. Hence we offer another proof that does not have this defect.


Proof. (Once more) We develop a different argument, starting from the point where we concluded that $\Delta$ is consistent. Since the language of this theory has cardinality $\leq n$,


Pause. Why?


we know by I.5.40 that

$$\text{we have a model } \mathfrak{M} = (M, \mathcal{I}) \text{ for } \Delta \text{ of cardinality} \leq n \tag{$**$}$$

Define now a function $f : N \to M$ by $f(n) = \overline{n}^{\mathcal{I}}$. Since all $n \in N$ have been imported into the language of $\Delta$, $f$ is total. $f$ is also 1-1: Indeed, if $c \neq d$ on $N$, then $\neg\,\overline{c} = \overline{d}$ is an axiom. Hence, $(\neg\,\overline{c} = \overline{d})^{\mathcal{I}} = \mathbf{t}$. That is, $\overline{c}^{\mathcal{I}} \neq \overline{d}^{\mathcal{I}}$ on $M$. But then (by results on cardinality in volume 2) the cardinality $n$ of $N$ is $\leq$ the cardinality of $M$. By yet another result about cardinality,† ($**$) and what we have just concluded imply that $N$ and $M$ have the same cardinality.

At this point we take the reduct of $\mathfrak{M}$ on $L$, exactly as in the previous proof.


The above proof required more set theory. But it was independent of any knowledge of the proof of the consistency theorem.


I.5.44 Remark (about Truth). The completeness theorem shows that the syntactic apparatus of a first order (formal) logic totally captures the semantic notion of truth, modulo the acceptance as true of any given assumptions $\Gamma$. This justifies the habit of the mathematician (even the formalist: see Bourbaki (1966b, p. 21)) of saying, in the context of any given theory $\Gamma$, "it is true" (meaning "it is a $\Gamma$-theorem", or "it is $\Gamma$-proved"), "it is false" (meaning "the negation is a $\Gamma$-theorem"), "assume that $\mathscr{A}$ is true" (meaning "add the formula $\mathscr{A}$ to $\Gamma$ as a nonlogical axiom"), and "assume that $\mathscr{A}$ is false" (meaning "add the formula $\neg\mathscr{A}$ to $\Gamma$ as a nonlogical axiom").

Thus, "it is true" (in the context of a theory) means "it is true in all its models", hence provable: a theorem. We will not use this particular argot.

There is yet another argot use of "is true" (often said emphatically, "is really true"), meaning truth in some specific structure, the "intended model". Due to Gödel's first incompleteness theorem (Section I.9) this truth does not coincide with provability.
On p. 38 we saw one way of generating theories, as the sets of theorems, $\mathbf{Thm}_\Gamma$, proved from some set of (nonlogical) axioms, $\Gamma$. Another way of generating theories is by taking all the formulas that are valid in some class‡ of structures.


† That $k \leq l \land l \leq k \rightarrow k = l$ for any cardinals $k$ and $l$.

‡ In axiomatic set theory (e.g., as this is developed in volume 2 of these lectures) the term "class" means a "collection" that is not necessarily a set, by virtue of its enormous size. Examples of such collections are that of all ordinal numbers, that of all cardinal numbers, that of all the objects that set theory talks about, and many others. In the present case we allow for a huge collection of structures (for example, all structures); hence we have used the term "class", rather than "set", deliberately.
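I.6.1 Definition. Let $L$ be a first order language and $\mathcal{C}$ a class of structures appropriate for $L$. We define two symbols

$$\mathcal{T}(\mathcal{C}) = \{\mathscr{A} : \text{for all } \mathfrak{M} \in \mathcal{C},\ \models_{\mathfrak{M}} \mathscr{A}\}$$

and

$$\mathrm{Th}(\mathcal{C}) = \{\mathscr{A} : \mathscr{A} \text{ is closed and, for all } \mathfrak{M} \in \mathcal{C},\ \models_{\mathfrak{M}} \mathscr{A}\}$$

If $\mathcal{C} = \{\mathfrak{M}\}$ then we write $\mathcal{T}(\mathfrak{M})$ and $\mathrm{Th}(\mathfrak{M})$ rather than $\mathcal{T}(\{\mathfrak{M}\})$ and $\mathrm{Th}(\{\mathfrak{M}\})$.


For any class of structures, $\mathcal{C}$, $\mathcal{T}(\mathcal{C})$ is a theory in the sense of p. 38. This follows from an easy verification of

$$\mathscr{A} \in \mathcal{T}(\mathcal{C}) \quad\text{iff}\quad \mathcal{T}(\mathcal{C}) \vdash \mathscr{A} \tag{1}$$

We verify the if direction of (1), the opposite one being trivial. To prove $\mathscr{A} \in \mathcal{T}(\mathcal{C})$ we let $\mathfrak{M} \in \mathcal{C}$ and prove

$$\models_{\mathfrak{M}} \mathscr{A} \tag{2}$$

By soundness, it suffices to prove $\models_{\mathfrak{M}} \mathcal{T}(\mathcal{C})$, i.e.,

$$\models_{\mathfrak{M}} \mathscr{B} \quad\text{for all } \mathscr{B} \in \mathcal{T}(\mathcal{C})$$

which holds by Definition I.6.1.


I.6.2 Example. For any structure $\mathfrak{M}$ for a language $L$, $\mathcal{T}(\mathfrak{M})$ is a complete theory: We want, for any sentence $\mathscr{A}$, $\mathcal{T}(\mathfrak{M}) \vdash \mathscr{A}$ or $\mathcal{T}(\mathfrak{M}) \vdash \neg\mathscr{A}$. By the previous observation, this translates to $\mathscr{A} \in \mathcal{T}(\mathfrak{M})$ or $(\neg\mathscr{A}) \in \mathcal{T}(\mathfrak{M})$. This holds, by I.6.1, since a sentence is either true or false in $\mathfrak{M}$.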
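
For a finite structure the dichotomy used in Example I.6.2 can be checked mechanically. The sketch below is an illustration under assumptions of its own: formulas are encoded as nested tuples (a hypothetical encoding, not the book's syntax), terms are restricted to variables, and the structure is the three-element linear order. The function `holds` is a naive truth evaluator in the spirit of the truth definition.

```python
def holds(formula, structure, env=None):
    """Truth of a formula in a finite structure under the assignment env."""
    env = env or {}
    domain, preds = structure
    op = formula[0]
    if op == "pred":                      # ("pred", name, var_1, ..., var_k)
        return tuple(env[v] for v in formula[2:]) in preds[formula[1]]
    if op == "eq":                        # ("eq", var_1, var_2)
        return env[formula[1]] == env[formula[2]]
    if op == "not":                       # ("not", A)
        return not holds(formula[1], structure, env)
    if op == "or":                        # ("or", A, B)
        return holds(formula[1], structure, env) or holds(formula[2], structure, env)
    if op == "ex":                        # ("ex", var, A)
        return any(holds(formula[2], structure, {**env, formula[1]: a})
                   for a in domain)
    raise ValueError(f"unknown connective: {op}")

# The linear order on {0, 1, 2}.
S = ({0, 1, 2}, {"<": {(0, 1), (0, 2), (1, 2)}})

# "There is a least element" and "some x satisfies x < x".
least = ("ex", "x", ("not", ("ex", "y", ("pred", "<", "y", "x"))))
loop = ("ex", "x", ("pred", "<", "x", "x"))

# For each sentence, exactly one of A and (not A) is true in S.
dichotomy = all(holds(A, S) != holds(("not", A), S) for A in (least, loop))
```

Because `holds` returns a Boolean for every sentence, exactly one of a sentence and its negation is true in `S`, which is the fact the example rests on.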


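I.6.3 Example. Let $\mathcal{C}$ be the (enormous) class of all structures for some language $L$. Then $\mathcal{T}(\mathcal{C})$ is the set of all universally valid formulas over $L$.


I.6.4 Definition. Suppose that $\mathfrak{M}$ and $\mathfrak{K}$ are two structures for a language $L$. If it happens that $\mathcal{T}(\mathfrak{M}) = \mathcal{T}(\mathfrak{K})$, then we say that $\mathfrak{M}$ and $\mathfrak{K}$ are elementarily equivalent and write $\mathfrak{M} \equiv \mathfrak{K}$.

The context will fend off any mistaking of the symbol for elementary equivalence, $\equiv$, for that for string equality (cf. I.1.4).


I.6.5 Proposition.† Let $\mathfrak{M}$ and $\mathfrak{K}$ be two structures for $L$. Then, $\mathfrak{M} \equiv \mathfrak{K}$ iff $\mathrm{Th}(\mathfrak{M}) = \mathrm{Th}(\mathfrak{K})$.


† Normally we use the word "Proposition" for a theorem that we do not want to make a big fuss about.

Proof. The only-if part is immediate from I.6.4. For the if part, let $\mathscr{A}$ be an arbitrary formula of $L$. Let $\mathscr{A}'$ be its universal closure (cf. I.4.11). Then $\models_{\mathfrak{M}} \mathscr{A}$ iff $\models_{\mathfrak{M}} \mathscr{A}'$ and $\models_{\mathfrak{K}} \mathscr{A}$ iff $\models_{\mathfrak{K}} \mathscr{A}'$; thus $\models_{\mathfrak{M}} \mathscr{A}$ iff $\models_{\mathfrak{K}} \mathscr{A}$ (since $\models_{\mathfrak{M}} \mathscr{A}'$ iff $\models_{\mathfrak{K}} \mathscr{A}'$).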


One way to obtain a structure that is elementarily equivalent to a structure $\mathfrak{M}$ is by a systematic renaming of all the members of the domain of $\mathfrak{M}$.


In what follows, if $\mathfrak{M} = (M, \mathcal{I})$ is a structure, $|\mathfrak{M}|$ is alternative notation for its domain $M$. Moreover, the interpretation of a language item $a$, $f$, or $P$ will be denoted by an $\mathfrak{M}$ superscript, e.g., $a^{\mathfrak{M}}$, rather than with an $\mathcal{I}$ superscript, as in $a^{\mathcal{I}}$. This alternative notation saves us from juggling far too many letters if we are discussing several structures at once ($\mathfrak{M}$, $\mathfrak{N}$, $M$, $N$, $\mathcal{I}$, etc.; see also I.5.2, p. 53).


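I.6.6 Definition (Embeddings, Isomorphisms, and Substructures). Let $\mathfrak{M}$ and $\mathfrak{K}$ be structures for a language $L$, and $\phi : |\mathfrak{M}| \to |\mathfrak{K}|$ be a total (that is, everywhere defined on $|\mathfrak{M}|$) and 1-1† function. $\phi$ is a structure embedding, in which case we write $\phi : \mathfrak{M} \to \mathfrak{K}$, just in case $\phi$ preserves all the nonlogical interpretations. This means the following:


(1) $a^{\mathfrak{K}} = \phi(a^{\mathfrak{M}})$ for all constants $a$ of $L$

(2) $f^{\mathfrak{K}}(\phi(i_1), \ldots, \phi(i_n)) = \phi(f^{\mathfrak{M}}(i_1, \ldots, i_n))$ for all $n$-ary functions $f$ of $L$ and all $i_j \in |\mathfrak{M}|$

(3) $P^{\mathfrak{K}}(\phi(i_1), \ldots, \phi(i_n)) = P^{\mathfrak{M}}(i_1, \ldots, i_n)$ for all $n$-ary predicates $P$ of $L$ and all $i_j \in |\mathfrak{M}|$


In the last two cases "$=$" is metamathematical, comparing members of $|\mathfrak{K}|$ in the former and members of the truth set, $\{\mathbf{t}, \mathbf{f}\}$, in the latter.

An important special case occurs when $\phi : |\mathfrak{M}| \to |\mathfrak{K}|$ is the inclusion map, that is, $\phi(i) = i$ for all $i \in |\mathfrak{M}|$, which entails $|\mathfrak{M}| \subseteq |\mathfrak{K}|$. We then say that $\mathfrak{M}$ is a substructure of $\mathfrak{K}$ and write $\mathfrak{M} \subseteq \mathfrak{K}$ (note the absence of $|\ldots|$). Symmetrically, we say that $\mathfrak{K}$ is an extension of $\mathfrak{M}$.

If $\phi$ is onto as well, that is, $(\forall b \in |\mathfrak{K}|)(\exists a \in |\mathfrak{M}|)\,\phi(a) = b$, then we say that it is a structure isomorphism, in symbols $\mathfrak{M} \cong_\phi \mathfrak{K}$, or just $\mathfrak{M} \cong \mathfrak{K}$.

It will be convenient to use the following abbreviation (Shoenfield (1967)): $i^\phi$ is short for $\phi(i)$ for all $i \in |\mathfrak{M}|$.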
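
For finite structures, conditions (1)-(3) can be checked by brute force. The following Python sketch is illustrative only: the dictionary layout of a structure (`domain`, `consts`, `funcs`, `preds`), the helper `linear_order`, and the function name `is_embedding` are assumptions made for this example, and functions are stored as finite lookup tables.

```python
from itertools import product

def is_embedding(phi, M, K):
    """Check that phi (a dict) is total, 1-1, and satisfies (1)-(3)."""
    dom = M["domain"]
    total = set(phi) == set(dom) and all(phi[i] in K["domain"] for i in dom)
    if not total or len(set(phi.values())) != len(dom):
        return False
    for a, value in M["consts"].items():                    # condition (1)
        if K["consts"][a] != phi[value]:
            return False
    for f, (n, table) in M["funcs"].items():                # condition (2)
        ktable = K["funcs"][f][1]
        for args in product(dom, repeat=n):
            if ktable[tuple(phi[i] for i in args)] != phi[table[args]]:
                return False
    for P, (n, rel) in M["preds"].items():                  # condition (3)
        krel = K["preds"][P][1]
        for args in product(dom, repeat=n):
            if (tuple(phi[i] for i in args) in krel) != (args in rel):
                return False
    return True

def linear_order(n):
    """The structure ({0..n-1}, <, max, 0), with max as a binary function."""
    dom = set(range(n))
    pairs = list(product(dom, repeat=2))
    return {"domain": dom,
            "consts": {"zero": 0},
            "funcs": {"max": (2, {p: max(p) for p in pairs})},
            "preds": {"<": (2, {p for p in pairs if p[0] < p[1]})}}

M, K = linear_order(3), linear_order(6)
doubling = {i: 2 * i for i in range(3)}      # order preserving, fixes 0
swapped = {0: 0, 1: 2, 2: 1}                 # 1-1 but reverses 1 < 2
```

Here `doubling` passes all three conditions, while `swapped` is total and 1-1 but fails to preserve `max` and `<`.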


† That is, $\phi(a) = \phi(b)$ implies $a = b$ for all $a$ and $b$ in $|\mathfrak{M}|$.
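There is no special significance in using the letter $\phi$ for the embedding or isomorphism, other than its being a symbol other than what we normally use (generically) for formal function symbols ($f$, $g$, $h$).


One way to visualize what is going on in Definition I.6.6 is to employ a so-called commutative diagram (for simplicity, for a unary $f$)

$$\begin{array}{ccc}
|\mathfrak{M}| \ni i & \longmapsto & f^{\mathfrak{M}}(i) \in |\mathfrak{M}| \\
\phi \downarrow & & \downarrow \phi \\
|\mathfrak{K}| \ni \phi(i) & \longmapsto & f^{\mathfrak{K}}(\phi(i)) \in |\mathfrak{K}|
\end{array}$$

That the diagram commutes means that all possible compositions give the same result. Here $\phi(f^{\mathfrak{M}}(i)) = f^{\mathfrak{K}}(\phi(i))$.

When $\mathfrak{M} \subseteq \mathfrak{K}$ and $\phi$ is the inclusion map, then the diagram above simply says that $f^{\mathfrak{M}}$ is the restriction of $f^{\mathfrak{K}}$ on $|\mathfrak{M}|$, in symbols, $f^{\mathfrak{M}} = f^{\mathfrak{K}} \restriction |\mathfrak{M}|$. Condition (3) in Definition I.6.6 simply says that $P^{\mathfrak{M}}$ is the restriction of $P^{\mathfrak{K}}$ on $|\mathfrak{M}|^n$, i.e., $P^{\mathfrak{M}} = P^{\mathfrak{K}} \cap |\mathfrak{M}|^n$. One often indicates this last equality by

$$P^{\mathfrak{M}} = P^{\mathfrak{K}} \mid |\mathfrak{M}|$$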


I.6.7 Remark. Sometimes we have a structure $\mathfrak{M} = (M, \mathcal{I})$ for $L$ and a 1-1 correspondence (total, 1-1, and onto) $\phi : M \to K$ for some set $K$. We can turn $K$ into a structure for $L$, isomorphic to $\mathfrak{M}$, by defining its "$\mathcal{I}$"-part mimicking Definition I.6.6. We say that $\phi$ induces an isomorphism $\mathfrak{M} \cong_\phi \mathfrak{K}$. We do this as follows:


(i) We set $a^{\mathfrak{K}} = \phi(a^{\mathfrak{M}})$ for all constants $a$ of $L$
(ii) We set $f^{\mathfrak{K}}(i_1, \ldots, i_n) = \phi(f^{\mathfrak{M}}(\phi^{-1}(i_1), \ldots, \phi^{-1}(i_n)))$ for all $n$-ary function symbols $f$ of $L$ and all $i_j$ in $K$.†
(iii) We set $P^{\mathfrak{K}}(i_1, \ldots, i_n) = P^{\mathfrak{M}}(\phi^{-1}(i_1), \ldots, \phi^{-1}(i_n))$ for all $n$-ary predicate symbols $P$ of $L$ and all $i_j$ in $K$.


It is now trivial to check via I.6.6 that for the $\mathfrak{K}$ so defined above, $\mathfrak{M} \cong_\phi \mathfrak{K}$ (Exercise I.47).
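
Clauses (i)-(iii) translate directly into a routine that transports a finite structure along a bijection. This is a sketch under assumptions of its own: structures are dictionaries with keys `domain`, `consts`, `funcs`, and `preds` (a layout invented for the example), functions are finite lookup tables, and the sample structure is the integers modulo 3 with a successor function and a predicate `isZero`.

```python
from itertools import product

def transport(M, phi):
    """Build the structure induced on the range of the bijection phi."""
    inv = {v: k for k, v in phi.items()}                     # phi^{-1}
    K = set(phi.values())
    consts = {a: phi[x] for a, x in M["consts"].items()}     # clause (i)
    funcs = {f: (n, {js: phi[table[tuple(inv[j] for j in js)]]
                     for js in product(K, repeat=n)})
             for f, (n, table) in M["funcs"].items()}        # clause (ii)
    preds = {P: (n, {js for js in product(K, repeat=n)
                     if tuple(inv[j] for j in js) in rel})
             for P, (n, rel) in M["preds"].items()}          # clause (iii)
    return {"domain": K, "consts": consts, "funcs": funcs, "preds": preds}

# Integers modulo 3 with successor, a constant for 0, and a predicate "isZero".
M = {"domain": {0, 1, 2},
     "consts": {"zero": 0},
     "funcs": {"succ": (1, {(i,): (i + 1) % 3 for i in range(3)})},
     "preds": {"isZero": (1, {(0,)})}}

phi = {0: "a", 1: "b", 2: "c"}       # renaming of the domain
K = transport(M, phi)
```

The result is the same structure with its elements renamed: successor now cycles through `"a"`, `"b"`, `"c"`, and `isZero` holds exactly of `"a"`.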


The following shows that an embedding (and a fortiori an isomorphism) preserves meaning beyond that of the nonlogical symbols.


† By $\phi^{-1}(a)$, for $a \in K$, we mean the unique $b \in M$ such that $\phi(b) = a$.
I.6.8 Theorem. Let $\phi : \mathfrak{M} \to \mathfrak{K}$ be an embedding of structures of a language $L$. Then:


(1) For every term $t$ of $L$ whose variables are among $\vec{x}_n$, and for all $\vec{i}_n$ in $|\mathfrak{M}|$, one has $t[\![i_1^\phi, \ldots, i_n^\phi]\!] = \phi(t[\![\vec{i}_n]\!])$.†

(2) For every atomic formula $\mathscr{A}$ of $L$ whose variables are among $\vec{x}_n$, and for all $\vec{i}_n$ in $|\mathfrak{M}|$, one has $\models_{\mathfrak{M}} \mathscr{A}[\![\vec{i}_n]\!]$ iff $\models_{\mathfrak{K}} \mathscr{A}[\![i_1^\phi, \ldots, i_n^\phi]\!]$.

(3) If $\phi$ is an isomorphism, then we can remove the restriction atomic from (2) above.


Strictly speaking, we are supposed to apply "$\models$" to a (formal) formula (cf. I.5.7). "$\mathscr{A}[\![\vec{i}_n]\!]$" is an abbreviated notation for a "concrete" (already interpreted) sentence. However, it is useful to extend our argot to allow the notation "$\models_{\mathfrak{M}} \mathscr{A}[\![\vec{i}_n]\!]$", which says "the concrete sentence $\mathscr{A}[\![\vec{i}_n]\!]$ is true in $\mathfrak{M}$", in short, $(\mathscr{A}[\overline{i_1}, \ldots, \overline{i_n}])^{\mathfrak{M}} = \mathbf{t}$ (cf. I.5.17), as this notation readily discloses where $\mathscr{A}$ was interpreted and therefore where it is claimed to be true.


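Proof. (1): We do induction on terms $t$ (cf. I.5.17). If $t \equiv a$, a constant of $L$, then the left hand side is $a^{\mathfrak{K}}$ while the right hand side is $\phi(a^{\mathfrak{M}})$. These (both in $|\mathfrak{K}|$) are equal by I.6.6.

If $t \equiv x_j$ (for some $x_j$ among the $\vec{x}_n$), then $t[\![i_1^\phi, \ldots, i_n^\phi]\!] = i_j^\phi \in |\mathfrak{K}|$ while $t[\![\vec{i}_n]\!] = i_j \in |\mathfrak{M}|$. Thus (1) of the theorem statement is proved in this case.

Finally, let $t \equiv f t_1 \ldots t_r$. Then

$$\begin{aligned}
\phi((f t_1 \ldots t_r)[\![\vec{i}_n]\!]) &= \phi(f^{\mathfrak{M}}(t_1[\![\vec{i}_n]\!], \ldots, t_r[\![\vec{i}_n]\!])) && \text{(cf. I.5.17)} \\
&= f^{\mathfrak{K}}(\phi(t_1[\![\vec{i}_n]\!]), \ldots, \phi(t_r[\![\vec{i}_n]\!])) && \text{(by I.6.6)} \\
&= f^{\mathfrak{K}}(t_1[\![i_1^\phi, \ldots, i_n^\phi]\!], \ldots, t_r[\![i_1^\phi, \ldots, i_n^\phi]\!]) && \text{(by I.H.)} \\
&= (f t_1 \ldots t_r)[\![i_1^\phi, \ldots, i_n^\phi]\!] && \text{(by I.5.17)}
\end{aligned}$$

(2): We have two cases:

Case where $\mathscr{A} \equiv P t_1 \ldots t_r$. Then

$$\begin{aligned}
&\models_{\mathfrak{M}} (P t_1 \ldots t_r)[\![\vec{i}_n]\!] \\
\text{iff}\quad &\models_{\mathfrak{M}} P^{\mathfrak{M}}(t_1[\![\vec{i}_n]\!], \ldots, t_r[\![\vec{i}_n]\!]) && \text{(by I.5.17)} \\
\text{iff}\quad &\models_{\mathfrak{K}} P^{\mathfrak{K}}(\phi(t_1[\![\vec{i}_n]\!]), \ldots, \phi(t_r[\![\vec{i}_n]\!])) && \text{(by I.6.6)} \\
\text{iff}\quad &\models_{\mathfrak{K}} P^{\mathfrak{K}}(t_1[\![i_1^\phi, \ldots, i_n^\phi]\!], \ldots, t_r[\![i_1^\phi, \ldots, i_n^\phi]\!]) && \text{(by (1))} \\
\text{iff}\quad &\models_{\mathfrak{K}} (P t_1 \ldots t_r)[\![i_1^\phi, \ldots, i_n^\phi]\!] && \text{(by I.5.17)}
\end{aligned}$$


† Recall I.5.17. Of course, $t[\![\vec{i}_n]\!] \in |\mathfrak{M}|$, while $t[\![i_1^\phi, \ldots, i_n^\phi]\!] \in |\mathfrak{K}|$.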
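Case where $\mathscr{A} \equiv t = s$. Then

$$\begin{aligned}
&\models_{\mathfrak{M}} (t = s)[\![\vec{i}_n]\!] \\
\text{iff}\quad &\models_{\mathfrak{M}} t[\![\vec{i}_n]\!] = s[\![\vec{i}_n]\!] && \text{(by I.5.17)} \\
\text{iff}\quad &\models_{\mathfrak{K}} \phi(t[\![\vec{i}_n]\!]) = \phi(s[\![\vec{i}_n]\!]) && \text{(if by 1-1-ness)} \\
\text{iff}\quad &\models_{\mathfrak{K}} t[\![i_1^\phi, \ldots, i_n^\phi]\!] = s[\![i_1^\phi, \ldots, i_n^\phi]\!] && \text{(by (1))} \\
\text{iff}\quad &\models_{\mathfrak{K}} (t = s)[\![i_1^\phi, \ldots, i_n^\phi]\!] && \text{(by I.5.17)}
\end{aligned}$$

(3): It is an easy exercise to see that if $\mathscr{A}$ and $\mathscr{B}$ have the property claimed, then so do $\neg\mathscr{A}$ and $\mathscr{A} \lor \mathscr{B}$ without the additional assumption of ontoness. Ontoness helps $(\exists x)\mathscr{A}$:

$$\begin{aligned}
&\models_{\mathfrak{M}} ((\exists x)\mathscr{A})[\![\vec{i}_n]\!] \\
\text{iff}\quad &(\exists a \in |\mathfrak{M}|) \models_{\mathfrak{M}} \mathscr{A}[\![a, \vec{i}_n]\!] && \text{(by I.5.17)} \\
\text{iff}\quad &(\exists a \in |\mathfrak{M}|) \models_{\mathfrak{K}} \mathscr{A}[\![a^\phi, i_1^\phi, \ldots, i_n^\phi]\!] && \text{(by I.H. on formulas)} \\
\text{iff}\quad &(\exists b \in |\mathfrak{K}|) \models_{\mathfrak{K}} \mathscr{A}[\![b, i_1^\phi, \ldots, i_n^\phi]\!] && \text{(if by ontoness)} \\
\text{iff}\quad &\models_{\mathfrak{K}} ((\exists x)\mathscr{A})[\![i_1^\phi, \ldots, i_n^\phi]\!] && \text{(by I.5.17)}
\end{aligned}$$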


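I.6.9 Corollary. If $\mathfrak{M}$ and $\mathfrak{K}$ are structures for $L$ and $\mathfrak{M} \subseteq \mathfrak{K}$, then for all quantifier-free formulas $\mathscr{A}$ whose variables are among $\vec{x}_n$, and for all $\vec{i}_n$ in $|\mathfrak{M}|$, one has $\models_{\mathfrak{M}} \mathscr{A}[\![\vec{i}_n]\!]$ iff $\models_{\mathfrak{K}} \mathscr{A}[\![\vec{i}_n]\!]$.


I.6.10 Corollary. If $\mathfrak{M}$ and $\mathfrak{K}$ are structures for $L$ and $\mathfrak{M} \cong_\phi \mathfrak{K}$, then $\mathfrak{M} \equiv \mathfrak{K}$.

Proof. Let $\mathscr{A}(\vec{x}_n)$ be a formula. The sought

$$\models_{\mathfrak{M}} \mathscr{A}(\vec{x}_n) \quad\text{iff}\quad \models_{\mathfrak{K}} \mathscr{A}(\vec{x}_n)$$

translates to

$$(\forall i_1 \in |\mathfrak{M}|) \ldots (\forall i_n \in |\mathfrak{M}|) \models_{\mathfrak{M}} \mathscr{A}[\![\vec{i}_n]\!] \quad\text{iff}\quad (\forall j_1 \in |\mathfrak{K}|) \ldots (\forall j_n \in |\mathfrak{K}|) \models_{\mathfrak{K}} \mathscr{A}[\![\vec{j}_n]\!]$$

which is true,† by (3) of I.6.8, since every $j \in |\mathfrak{K}|$ is an $i^\phi$ (for some $i \in |\mathfrak{M}|$).


I.6.11 Corollary. Let $\mathfrak{M}$ and $\mathfrak{K}$ be structures for $L$, $\mathfrak{M} \equiv \mathfrak{K}$, and $T$ a theory over $L$. Then $\models_{\mathfrak{M}} T$ iff $\models_{\mathfrak{K}} T$.


Condition (2) in I.6.8 is quite strong, as the following shows.


† Since we are arguing in the metatheory, we can say "true" rather than "provable".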
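I.6.12 Theorem. Let $\mathfrak{M}$ and $\mathfrak{K}$ be structures for $L$, and $\phi : |\mathfrak{M}| \to |\mathfrak{K}|$ be a total function. If for every atomic $\mathscr{A}$ that contains at most the variables $\vec{x}_n$ and for all $\vec{i}_n$ in $|\mathfrak{M}|$ one has

$$\models_{\mathfrak{M}} \mathscr{A}[\![\vec{i}_n]\!] \quad\text{iff}\quad \models_{\mathfrak{K}} \mathscr{A}[\![i_1^\phi, \ldots, i_n^\phi]\!] \tag{1}$$

then $\phi : \mathfrak{M} \to \mathfrak{K}$ is an embedding.


Proof. First off, $\phi$ is 1-1. Indeed, since $x = y$ is atomic, then, by (1) above, for all $i$ and $j$ in $|\mathfrak{M}|$, $\models_{\mathfrak{M}} i = j$ iff $\models_{\mathfrak{K}} i^\phi = j^\phi$.† The if direction gives 1-1-ness.

We now check the three conditions of Definition I.6.6.

First, let $a^{\mathfrak{M}} = i$. Using the atomic formula $a = x$, we get (by (1))

$$\models_{\mathfrak{M}} (a = x)[\![i]\!] \quad\text{iff}\quad \models_{\mathfrak{K}} (a = x)[\![i^\phi]\!]$$

that is,

$$\models_{\mathfrak{M}} a^{\mathfrak{M}} = i \quad\text{iff}\quad \models_{\mathfrak{K}} a^{\mathfrak{K}} = i^\phi$$

Since $a^{\mathfrak{M}} = i$ is true, so is $a^{\mathfrak{K}} = i^\phi$. That is, $a^{\mathfrak{K}} = \phi(a^{\mathfrak{M}})$. (This part only needed the only-if direction of (1).)

Let now $f$ be $n$-ary. Consider the atomic formula $f(\vec{x}_n) = x_{n+1}$. By (1),

$$\models_{\mathfrak{M}} (f(\vec{x}_n) = x_{n+1})[\![\vec{i}_{n+1}]\!] \quad\text{iff}\quad \models_{\mathfrak{K}} (f(\vec{x}_n) = x_{n+1})[\![i_1^\phi, \ldots, i_{n+1}^\phi]\!]$$

That is,

$$\models_{\mathfrak{M}} f^{\mathfrak{M}}(\vec{i}_n) = i_{n+1} \quad\text{iff}\quad \models_{\mathfrak{K}} f^{\mathfrak{K}}(i_1^\phi, \ldots, i_n^\phi) = i_{n+1}^\phi$$

Thus $f^{\mathfrak{K}}(i_1^\phi, \ldots, i_n^\phi) = \phi(f^{\mathfrak{M}}(\vec{i}_n))$. (This part too only needed the only-if direction of (1).)

Finally, let $P$ be $n$-ary. By (1)

$$\models_{\mathfrak{M}} P(\vec{x}_n)[\![\vec{i}_n]\!] \quad\text{iff}\quad \models_{\mathfrak{K}} P(\vec{x}_n)[\![i_1^\phi, \ldots, i_n^\phi]\!]$$

That is,

$$\models_{\mathfrak{M}} P^{\mathfrak{M}}(\vec{i}_n) \quad\text{iff}\quad \models_{\mathfrak{K}} P^{\mathfrak{K}}(i_1^\phi, \ldots, i_n^\phi)$$


I.6.13 Corollary. Let $\mathfrak{M}$ and $\mathfrak{K}$ be structures for $L$, and $\phi : |\mathfrak{M}| \to |\mathfrak{K}|$ be a total function. If, for every atomic and negated atomic $\mathscr{A}$ that contains at most the variables $\vec{x}_n$ and for all $\vec{i}_n$ in $|\mathfrak{M}|$ one has

$$\models_{\mathfrak{M}} \mathscr{A}[\![\vec{i}_n]\!] \quad\text{implies}\quad \models_{\mathfrak{K}} \mathscr{A}[\![i_1^\phi, \ldots, i_n^\phi]\!]$$

then $\phi : \mathfrak{M} \to \mathfrak{K}$ is an embedding.


† $(x = y)[\![i, j]\!]$ is $i = j$.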
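Proof. Let $\models_{\mathfrak{K}} \mathscr{A}[\![i_1^\phi, \ldots, i_n^\phi]\!]$, $\mathscr{A}$ atomic. If $\not\models_{\mathfrak{M}} \mathscr{A}[\![\vec{i}_n]\!]$, then $\models_{\mathfrak{M}} \neg\mathscr{A}[\![\vec{i}_n]\!]$. $\neg\mathscr{A}$ is negated atomic. Thus, $\models_{\mathfrak{K}} \neg\mathscr{A}[\![i_1^\phi, \ldots, i_n^\phi]\!]$, i.e., $\not\models_{\mathfrak{K}} \mathscr{A}[\![i_1^\phi, \ldots, i_n^\phi]\!]$, contradicting the assumption. The "implies" has now been promoted to "iff".


I.6.14 Corollary. Let $\mathfrak{M}$ and $\mathfrak{K}$ be structures for $L$, and $|\mathfrak{M}| \subseteq |\mathfrak{K}|$. If for every atomic and negated atomic $\mathscr{A}$ that contains at most the variables $\vec{x}_n$ and for all $\vec{i}_n$ in $|\mathfrak{M}|$ one has

$$\models_{\mathfrak{M}} \mathscr{A}[\![\vec{i}_n]\!] \quad\text{implies}\quad \models_{\mathfrak{K}} \mathscr{A}[\![\vec{i}_n]\!]$$

then $\mathfrak{M} \subseteq \mathfrak{K}$.


Proof. $|\mathfrak{M}| \subseteq |\mathfrak{K}|$ means that we may take here $\phi : |\mathfrak{M}| \to |\mathfrak{K}|$ to be the inclusion map.


The converses of the above two corollaries hold (these, essentially, are I.6.8 and I.6.9).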


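Moreover, these corollaries lead to


I.6.15 Definition (Elementary Embeddings and Substructures). Let $\mathfrak{M}$ and $\mathfrak{K}$ be structures for $L$, and $\phi : |\mathfrak{M}| \to |\mathfrak{K}|$ a total function. If, for every formula $\mathscr{A}$ that contains at most the variables $\vec{x}_n$ and for all $\vec{i}_n$ in $|\mathfrak{M}|$, one has

$$\models_{\mathfrak{M}} \mathscr{A}[\![\vec{i}_n]\!] \quad\text{implies}\quad \models_{\mathfrak{K}} \mathscr{A}[\![i_1^\phi, \ldots, i_n^\phi]\!] \tag{1}$$

then $\phi : \mathfrak{M} \to \mathfrak{K}$ is called an elementary embedding. We write $\phi : \mathfrak{M} \to_\prec \mathfrak{K}$ in this case.

If $\phi$ is the inclusion map $|\mathfrak{M}| \subseteq |\mathfrak{K}|$, then (1) becomes

$$\models_{\mathfrak{M}} \mathscr{A}[\![\vec{i}_n]\!] \quad\text{implies}\quad \models_{\mathfrak{K}} \mathscr{A}[\![\vec{i}_n]\!] \tag{2}$$

In this case we say that $\mathfrak{M}$ is an elementary substructure of $\mathfrak{K}$, or that the latter is an elementary extension of the former. In symbols, $\mathfrak{M} \prec \mathfrak{K}$.


I.6.16 Remark. The "implies" in (1) and (2) in the definition above is promoted to "iff" as in the proof of I.6.13.

By I.6.13 and I.6.14, an elementary embedding $\phi : \mathfrak{M} \to_\prec \mathfrak{K}$ is also an embedding $\phi : \mathfrak{M} \to \mathfrak{K}$, and an elementary substructure $\mathfrak{M} \prec \mathfrak{K}$ is also a substructure $\mathfrak{M} \subseteq \mathfrak{K}$.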
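If $\phi : \mathfrak{M} \to_\prec \mathfrak{K}$, then condition (1) in Definition I.6.15 (which, as we have already noted, is an equivalence) yields, for all sentences $\mathscr{A}$ over $L$,

$$\models_{\mathfrak{M}} \mathscr{A} \quad\text{iff}\quad \models_{\mathfrak{K}} \mathscr{A}$$

By I.6.5, $\mathfrak{M} \equiv \mathfrak{K}$. In particular, $\mathfrak{M} \prec \mathfrak{K}$ implies $\mathfrak{M} \equiv \mathfrak{K}$.


I.6.17 Remark. In many arguments in model theory, starting with a structure $\mathfrak{M}$ for some language $L$, one constructs another structure $\mathfrak{K}$ for $L$ and an embedding $\phi : \mathfrak{M} \to \mathfrak{K}$ or an elementary embedding $\phi : \mathfrak{M} \to_\prec \mathfrak{K}$. More often than not one would rather have a structure extension or an elementary extension of $\mathfrak{M}$ (i.e., preferring that $\phi$ be the inclusion map) respectively. This can be easily obtained as follows:

Let $\mathfrak{M} = (M, \ldots)$ and $\mathfrak{K} = (K, \ldots)$. Let $N$ be a set disjoint from $M$,† of cardinality equal to that of $K - \phi[M]$.‡ Thus there is a 1-1 correspondence $\psi : K - \phi[M] \to N$, from which we can build a 1-1 correspondence $\chi : K \to M \cup N$ by defining

$$\chi(x) = \begin{cases} \psi(x) & \text{if } x \in K - \phi[M] \\ \phi^{-1}(x) & \text{if } x \in \phi[M] \end{cases}$$

Using Remark I.6.7, we get a structure $\mathfrak{K}' = (M \cup N, \ldots)$ such that

$$\mathfrak{K} \cong_\chi \mathfrak{K}' \tag{1}$$

We verify that if we had $\phi : \mathfrak{M} \to_\prec \mathfrak{K}$ initially, then we now have§

$$\chi \circ \phi : \mathfrak{M} \to_\prec \mathfrak{K}' \tag{2}$$

Well, let $\mathscr{A}$ be arbitrary over $L$ with free variables among $\vec{x}_n$. Pick any $\vec{i}_n$ in $M$, and assume $\models_{\mathfrak{M}} \mathscr{A}[\![\vec{i}_n]\!]$. By hypothesis on $\phi$ and I.6.15, $\models_{\mathfrak{K}} \mathscr{A}[\![\phi(i_1), \ldots, \phi(i_n)]\!]$. By I.6.8(3) and (1) above, $\models_{\mathfrak{K}'} \mathscr{A}[\![\chi(\phi(i_1)), \ldots, \chi(\phi(i_n))]\!]$. This settles (2), by I.6.15. By definition of $\chi$ above, if $x \in M$, then, since $\phi(x) \in \phi[M]$, we have $\chi(\phi(x)) = \phi^{-1}(\phi(x)) = x$. That is, (2) is an elementary extension ($\chi \circ \phi$ is the identity on $M$).

The alternative initial result, an embedding $\phi : \mathfrak{M} \to \mathfrak{K}$, yields a structure extension $\chi \circ \phi : \mathfrak{M} \subseteq \mathfrak{K}'$, since the composition of embeddings is an embedding (Exercise I.48).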
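
The bookkeeping behind $\chi$ is easy to lose track of, so here is a small Python sketch of it for finite sets. Everything in it is an assumption made for illustration: `extension_bijection` is a name invented here, the pairs `("new", j)` play the role of the set $N$ (and are assumed not to occur in $M$), and the sample sets and map `phi` are arbitrary.

```python
def extension_bijection(M, K, phi):
    """Return chi : K -> M ∪ N with chi(phi(x)) = x for every x in M.

    phi is a 1-1 map from M into K, given as a dict.  Elements of K outside
    phi[M] are sent to fresh names ("new", j), which stand in for N.
    """
    image = set(phi.values())                      # phi[M]
    inv = {v: k for k, v in phi.items()}           # phi^{-1} on phi[M]
    rest = sorted(K - image, key=repr)             # K - phi[M], in a fixed order
    psi = {x: ("new", j) for j, x in enumerate(rest)}
    return {x: (inv[x] if x in image else psi[x]) for x in K}

M = {0, 1, 2}
K = {"a", "b", "c", "d", "e"}
phi = {0: "b", 1: "d", 2: "e"}

chi = extension_bijection(M, K, phi)
identity_on_M = all(chi[phi[x]] == x for x in M)   # chi o phi is the identity
one_to_one = len(set(chi.values())) == len(K)      # chi is 1-1
```

Transporting the structure on `K` along `chi` (as in Remark I.6.7) then yields a structure whose domain literally contains `M`.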


† $N \cap M = \emptyset$.
‡ $\phi[X] = \{\phi(x) : x \in X\}$. $X - Y = \{x \in X : x \notin Y\}$.
§ $\chi \circ \phi$ denotes function composition. That is, $(\chi \circ \phi)(x) = \chi(\phi(x))$ for all $x \in M$.
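The following criterion for elementary extensions evaluates truth in the bigger of the two structures. It is useful, among other places, in the proof of the downward Löwenheim-Skolem theorem (I.6.20 below).


I.6.18 Proposition. Let $\mathfrak{M}$ and $\mathfrak{K}$ be structures for $L$. If $\mathfrak{M} \subseteq \mathfrak{K}$, and moreover, for every formula $\mathscr{A}$ whose free variables are among $y$ and $\vec{x}_n$, and for all $\vec{b}_n$ in $|\mathfrak{M}|$,

$$(\exists a \in |\mathfrak{K}|) \models_{\mathfrak{K}} \mathscr{A}[\![a, \vec{b}_n]\!] \quad\text{implies}\quad (\exists a \in |\mathfrak{M}|) \models_{\mathfrak{K}} \mathscr{A}[\![a, \vec{b}_n]\!] \tag{3}$$

then $\mathfrak{M} \prec \mathfrak{K}$.


Proof. We need to prove (2) of I.6.15 using (3) above.

We do induction on formulas. We have already remarked in I.6.16 that the definition is an "iff", and it is more convenient to carry the induction through with an "iff" rather than an "implies".

First off, by $\mathfrak{M} \subseteq \mathfrak{K}$ and Corollary I.6.9, we are done for all quantifier-free formulas. This takes care of atomic formulas.

For the induction step, from $\mathscr{A}$ and $\mathscr{B}$ to $\neg\mathscr{A}$ and $\mathscr{A} \lor \mathscr{B}$, we note (with $\vec{i}_n$ arbitrary, in $|\mathfrak{M}|$)

$$\begin{aligned}
\models_{\mathfrak{M}} \neg\mathscr{A}[\![\vec{i}_n]\!] \quad&\text{iff}\quad \not\models_{\mathfrak{M}} \mathscr{A}[\![\vec{i}_n]\!] \\
&\text{iff}\quad \not\models_{\mathfrak{K}} \mathscr{A}[\![\vec{i}_n]\!] && \text{by I.H.} \\
&\text{iff}\quad \models_{\mathfrak{K}} \neg\mathscr{A}[\![\vec{i}_n]\!]
\end{aligned}$$

Similarly,

$$\begin{aligned}
\models_{\mathfrak{M}} (\mathscr{A} \lor \mathscr{B})[\![\vec{i}_n]\!] \quad&\text{iff}\quad \models_{\mathfrak{M}} \mathscr{A}[\![\vec{i}_n]\!] \text{ or } \models_{\mathfrak{M}} \mathscr{B}[\![\vec{i}_n]\!] \\
&\text{iff}\quad \models_{\mathfrak{K}} \mathscr{A}[\![\vec{i}_n]\!] \text{ or } \models_{\mathfrak{K}} \mathscr{B}[\![\vec{i}_n]\!] && \text{by I.H.} \\
&\text{iff}\quad \models_{\mathfrak{K}} (\mathscr{A} \lor \mathscr{B})[\![\vec{i}_n]\!]
\end{aligned}$$

It only remains to check existential formulas. The implication

$$\models_{\mathfrak{M}} ((\exists x)\mathscr{A})[\![\vec{i}_n]\!] \quad\text{implies}\quad \models_{\mathfrak{K}} ((\exists x)\mathscr{A})[\![\vec{i}_n]\!]$$

that is,

$$(\exists a \in |\mathfrak{M}|) \models_{\mathfrak{M}} \mathscr{A}[\![a, \vec{i}_n]\!] \quad\text{implies}\quad (\exists a \in |\mathfrak{K}|) \models_{\mathfrak{K}} \mathscr{A}[\![a, \vec{i}_n]\!]$$

being trivial by $|\mathfrak{M}| \subseteq |\mathfrak{K}|$, we check the opposite direction. Let $\vec{i}_n$ still be in $|\mathfrak{M}|$, and $(\exists a \in |\mathfrak{K}|) \models_{\mathfrak{K}} \mathscr{A}[\![a, \vec{i}_n]\!]$. By (3), $(\exists a \in |\mathfrak{M}|) \models_{\mathfrak{K}} \mathscr{A}[\![a, \vec{i}_n]\!]$. By I.H., $(\exists a \in |\mathfrak{M}|) \models_{\mathfrak{M}} \mathscr{A}[\![a, \vec{i}_n]\!]$.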
I.6.19 Example. The structure $\mathfrak{N}' = (\mathbb{N}, <, 0)$ is a reduct of the standard model of arithmetic. It is essentially a linearly ordered set, where we have also focused on a "distinguished element", 0, that also happens to be the minimum element of the order. Any nonempty $B \subseteq \mathbb{N}$ that contains 0 leads to a $\mathfrak{B} = (B, <, 0) \subseteq \mathfrak{N}'$, trivially.

If we ask for more, that $\mathfrak{B} \prec \mathfrak{N}'$, then we reduce our choices (of $B$ that work) drastically. Indeed, let $\mathscr{A}(x, y)$ say that $x$ and $y$ are consecutive, $x < y$. That is, $\mathscr{A}(x, y)$ is

$$x < y \land \neg(\exists z)(x < z \land z < y)$$

Then

$$(\forall a \in \mathbb{N}) \models_{\mathfrak{N}'} ((\exists y)\mathscr{A}(x, y))[\![a]\!] \tag{1}$$

Thus, if $\mathfrak{B} \prec \mathfrak{N}'$, then we require $\mathfrak{B} \subseteq \mathfrak{N}'$ (in particular, $0 \in B$) and (from (1), I.6.15 and I.6.16)

$$(\forall a \in B)(\exists b \in B) \models_{\mathfrak{N}'} \mathscr{A}[\![a, b]\!] \tag{2}$$

$0 \in B$ and (2) yield (by metamathematical induction) that $B = \mathbb{N}$: by (2), whenever $a \in B$, its immediate successor $a + 1$ is also in $B$.
The following is a useful tool in model theory. 


I.6.20 Theorem (Downward Löwenheim-Skolem Theorem). Assume that $L$ has cardinality at most $m$ (that is, the set of nonlogical symbols does†), while the structure $\mathfrak{M} = (M, \mathcal{I})$ (for $L$) has cardinality at least $m$. Let $X \subseteq M$, and the cardinality of $X$ be $\leq m$. Then there is a structure $\mathfrak{K}$ for $L$, of cardinality exactly $m$, that satisfies $X \subseteq |\mathfrak{K}|$ and $\mathfrak{K} \prec \mathfrak{M}$.


This proof requires some facility with cardinal manipulation. In particular, we use facts such as that, if $m$ is an infinite cardinal and $n \in \mathbb{N}$ while $\aleph_0$ is the cardinality of $\mathbb{N}$, then $(n + 1) \cdot m = \aleph_0 \cdot m = m^{n+1} = m$, where "$\cdot$" denotes cardinal multiplication.


Proof. (Thinking out loud.) If $\mathfrak{K} = (K, \ldots)$, still to be defined, is to satisfy $\mathfrak{K} \prec \mathfrak{M}$, then we have to ensure (I.6.18), among other things, that for every $\mathscr{A}(y, \vec{x}_n)$ over $L$ and for all $\vec{a}_n$ in $K$, if $\mathscr{A}[\![b, \vec{a}_n]\!]$ is true in $\mathfrak{M}$ (where $b \in M$), then some $b' \in K$ exists such that $\mathscr{A}[\![b', \vec{a}_n]\!]$ is also true in $\mathfrak{M}$.


† It is a set-theoretical fact that the union of a countable set and an infinite set has cardinality equal to that of the infinite set, or $\aleph_0 + m = m$. Recall that the set of logical symbols is countable.
In short, we need to obtain $K$ as a subset of $M$ that includes $X$ and is closed under all the relations

$$Q_{\mathscr{A}} = \{\langle \vec{a}_n, b\rangle : \models_{\mathfrak{M}} \mathscr{A}(y, \vec{x}_n)[\![b, \vec{a}_n]\!]\} \tag{1}$$

where $b$ is the output part in $Q_{\mathscr{A}}$ above.

It is prudent to take two precautions:

First, in order to get the smallest cardinal possible for $K$, we build the $\subseteq$-smallest set described in italics above, that is, we take $K = \mathrm{Cl}(X, \mathscr{R})$,† where $\mathscr{R}$ is the set of all the relations $Q_{\mathscr{A}}$ as $\mathscr{A}$ varies over all the formulas over $L$.

Second, in order to place a lower bound of $m$ to the cardinality of $K$, we start, rather than with $X$, with a set $Y$ of cardinality exactly $m$ such that $X \subseteq Y \subseteq M$. Such a $Y$ exists by the size estimates for $X$ and $M$. In other words, we have just changed our mind, and now let $K = \mathrm{Cl}(Y, \mathscr{R})$.

Actually, a bit more prudence is in order. Let us rename the $\mathrm{Cl}(Y, \mathscr{R})$ above $K'$. We are still looking for a $K$ that we will keep.

We will "cut down" each $Q_{\mathscr{A}}$ in (1), with the help of the axiom of choice (AC), making the relation single-valued‡ in its output, $b$. (End of thinking out loud.)

We define

$$\widehat{Q}_{\mathscr{A}} = \{\langle \vec{a}_n, b\rangle : b \text{ picked by AC in } \{x : Q_{\mathscr{A}}(\vec{a}_n, x)\} \text{ if } (\exists x \in M)\,Q_{\mathscr{A}}(\vec{a}_n, x)\} \tag{2}$$

Let now $\widehat{\mathscr{R}}$ be the set of all $\widehat{Q}_{\mathscr{A}}$, and set $K = \mathrm{Cl}(Y, \widehat{\mathscr{R}})$. This one we keep!


First off, trivially, $K \subseteq M$. To turn $K$ into a substructure of $\mathfrak{M}$, we have to
interpret every $f$, $a$, $P$ (function, constant and predicate, respectively) of $L$
as $f^{\mathfrak{M}} \restriction K^n$ ($n$-ary case), $a^{\mathfrak{M}}$ (left as is), and $P^{\mathfrak{M}} \mid K$ ($n$-ary case), respectively.

For functions and constants we have more work: For the former we need to
show that $K$ is closed under all $f^{\mathfrak{M}} \restriction K^n$. Let then $f$ be $n$-ary in $L$ and $\vec{a}_n$ be in
$K$. Then $f^{\mathfrak{M}}(\vec{a}_n) = b$ and $b \in M$. We want $b \in K$.

Well, if we set $\mathscr{A} \equiv f(\vec{x}_n) = y$, then $\models_{\mathfrak{M}} (f(\vec{x}_n) = y)[[\vec{a}_n, b]]$; hence
$Q_{\mathscr{A}}(\vec{a}_n, b)$ is true (cf. (1)), and therefore $\widehat{Q}_{\mathscr{A}}(\vec{a}_n, b)$ is true, for $b$ is unique,
given the $\vec{a}_n$, so that it must be what AC picks in (2). Thus, since $K$ is
$\widehat{\mathscr{R}}$-closed, $b \in K$. Similarly one proves the case for constants $a$ of $L$,
using the formula $a = y$.

Thus we do have a substructure, $\mathfrak{K} = (K, \ldots)$, of $\mathfrak{M}$.


It remains to settle the property of Proposition I.6.18 and the cardinality
claims.

† You may want to review Section I.2 at this point.
‡ A fancy term for "function". This function need not be defined everywhere on $M$.

First off, let $\models_{\mathfrak{M}} \mathscr{A}[[b, \vec{a}_n]]$, where $b \in M$, and the $\vec{a}_n$ are in $K$. For some
$b'$ picked by AC in (2), $\widehat{Q}_{\mathscr{A}}(\vec{a}_n, b')$ is true, that is, $\models_{\mathfrak{M}} \mathscr{A}[[b', \vec{a}_n]]$. Since $K$ is
$\widehat{\mathscr{R}}$-closed, $b' \in K$. Thus, the $\prec$-criterion for $\mathfrak{K} \prec \mathfrak{M}$ is met.

Finally, we compute the cardinality of $K$. This inductively defined set is
being built by stages (see I.2.9),

$$K = \bigcup_{i \ge 0} K_i \qquad (3)$$

At stage 0 we have $K_0 = Y$. Having built $K_i$, $i \le n$, we build $K_{n+1}$ by appending
to $\bigcup_{i \le n} K_i$ all $b$ such that $\widehat{Q}_{\mathscr{A}}(\vec{a}_r, b)$ holds. We do so for all $\widehat{Q}_{\mathscr{A}}(\vec{x}_r, y)$ and
all possible $\vec{a}_r$ in $\bigcup_{i \le n} K_i$.
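The stage-by-stage construction can be tried out on a finite toy case. The sketch below is only an illustration under stated assumptions: the universe $\{0, \ldots, 11\}$, the operations "addition modulo 12" and "halving of even numbers", and the names `closure_by_stages`, `add_mod_12` and `half` are all hypothetical, and ordinary Python functions (returning `None` where undefined) stand in for the single-valued relations picked by AC.

```python
from itertools import product

def closure_by_stages(seed, operations):
    """Return (K, stages): the closure of `seed` under `operations`.

    `operations` is a list of (arity, function) pairs. Each function plays
    the role of a single-valued, possibly partial relation: it returns an
    output, or None when no output exists for the given inputs.
    `stages` lists the elements that are new at each stage.
    """
    stages = [frozenset(seed)]   # stage 0 is the seed set Y
    current = set(seed)          # union of all stages built so far
    while True:
        new = set()
        for arity, op in operations:
            # feed every tuple of already-built elements to the operation
            for args in product(sorted(current), repeat=arity):
                out = op(*args)
                if out is not None and out not in current:
                    new.add(out)
        if not new:              # nothing was appended: the set is closed
            return current, stages
        stages.append(frozenset(new))
        current |= new

# Hypothetical toy operations on the universe {0, ..., 11}.
add_mod_12 = (2, lambda a, b: (a + b) % 12)
half = (1, lambda a: a // 2 if a % 2 == 0 else None)   # partial: even inputs only

K, stages = closure_by_stages({4}, [add_mod_12])
K2, _ = closure_by_stages({4}, [add_mod_12, half])
print(sorted(K), [sorted(s) for s in stages], sorted(K2))
```

In a finite universe the stages stabilize after finitely many steps; in the proof the same bookkeeping runs through all stages $i \ge 0$, and the point of the cardinality estimate is that each stage contributes at most $\mathfrak{m}$ new elements.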
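We show that the cardinality of each $K_n$ is $\le \mathfrak{m}$. This is true for $K_0$. We take
an I.H. for all $n \le k$ and consider $K_{k+1}$ next.

Now, the set of formulas over $L$ has cardinality $\le \mathfrak{m}$; hence

$$\text{the set } \widehat{\mathscr{R}} \text{ as well has cardinality} \le \mathfrak{m} \qquad (4)$$

We let $S = \bigcup_{i \le k} K_i$ for convenience. By I.H.,

$$(\text{cardinality of } S) \le (k+1) \cdot \mathfrak{m} = \mathfrak{m} \qquad (5)$$

For each $\widehat{Q}_{\mathscr{A}}(\vec{x}_r, x_{r+1}) \in \widehat{\mathscr{R}}$, the cardinality of the set of contributed outputs,
taking inputs in $S$, is at most equal to the cardinality of $S^r$, i.e., $\le \mathfrak{m}^r = \mathfrak{m}$. The
total contribution of all the $\widehat{Q}_{\mathscr{A}}$ is then, by (4), of cardinality $\le \mathfrak{m} \cdot \mathfrak{m} = \mathfrak{m}$.
Thus, using (5), the cardinality of $K_{k+1}$ is $\le \mathfrak{m} + \mathfrak{m} = \mathfrak{m}$. By (3), the cardinal
of $K$ is $\le \aleph_0 \cdot \mathfrak{m} = \mathfrak{m}$. Since it is also $\ge \mathfrak{m}$ (by $Y \subseteq K$), we are done.

N.B. The set $K'$ that we have discarded earlier also satisfies $\mathfrak{K}' = (K', \ldots) \prec \mathfrak{M}$,
and has cardinality at most that of $M$. We cannot expect though that it has
cardinality $\le \mathfrak{m}$.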


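I.6.21 Corollary. Assume that $L$ has cardinality at most $\mathfrak{m}$, while the structure
$\mathfrak{M} = (M, \mathscr{I})$ (for $L$) has cardinality at least $\mathfrak{m}$. Let $\mathfrak{X} \prec \mathfrak{M}$, and the cardinality
of $|\mathfrak{X}|$ be $\le \mathfrak{m}$. Then there is a structure $\mathfrak{K}$ for $L$, of cardinality exactly $\mathfrak{m}$, that
satisfies $\mathfrak{X} \prec \mathfrak{K} \prec \mathfrak{M}$.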


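Proof. Working with $X = |\mathfrak{X}|$ exactly as in I.6.20, we get $\mathfrak{K}$. It is straightforward
to check that $\mathfrak{X} \subseteq \mathfrak{K}$.† The $\prec$-criterion (I.6.18) for $\mathfrak{X} \prec \mathfrak{K}$ is: Given $\mathscr{A}$ with free
variables among the $y$ and $\vec{x}_n$, and $\vec{a}_n$ in $X$. Verify that if

$$\text{for some } b \in K, \quad \models_{\mathfrak{K}} \mathscr{A}[[b, \vec{a}_n]] \qquad (1)$$

then

$$\models_{\mathfrak{K}} \mathscr{A}[[b', \vec{a}_n]] \quad \text{for some } b' \in X \qquad (2)$$

Well, $\mathfrak{K} \prec \mathfrak{M}$ and (1) yield $\models_{\mathfrak{M}} \mathscr{A}[[b, \vec{a}_n]]$. (2) follows from $\mathfrak{X} \prec \mathfrak{M}$.

† For $n$-ary $f$ in $L$, $f^{\mathfrak{X}} = f^{\mathfrak{M}} \restriction X^n$ by assumption. On the other hand, $f^{\mathfrak{K}} = f^{\mathfrak{M}} \restriction K^n$ by
construction of $\mathfrak{K}$. But $X \subseteq K$.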


The following definition builds on 1.5.4 and 1.5.20. 


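I.6.22 Definition (Diagrams). Given a language $L$ and a structure $\mathfrak{A}$ for $L$. Let
$\emptyset \ne M \subseteq |\mathfrak{A}|$. We call $L(M)$, constructed as in I.5.20, the diagram language of
$M$. We denote by $\mathfrak{A}_M$ the expansion of $\mathfrak{A}$ that has all else the same as $\mathfrak{A}$, except
that it gives the "natural meaning" to the new constants, $\bar{i}$, of $L(M)$. Namely,
$\bar{i}^{\mathfrak{A}_M} = i$ for all $i \in M$. $\mathfrak{A}_M$ is called the diagram expansion of $\mathfrak{A}$ (over $L(M)$).

We often write $\mathfrak{A}_M = (|\mathfrak{A}|, \ldots, (i)_{i \in M})$ to indicate that constants $\bar{i}$ with
interpretations $i$ were added. The part "$(|\mathfrak{A}|, \ldots)$" is just $\mathfrak{A}$.

If $M = |\mathfrak{A}|$, then we write $L(\mathfrak{A})$ for $L(M)$, as in I.5.4.

The basic diagram of $\mathfrak{A}$, denoted by $D(\mathfrak{A})$, is the set of all atomic and
negated atomic sentences of $L(\mathfrak{A})$ that are true in $\mathfrak{A}_{|\mathfrak{A}|}$.

The elementary diagram of $\mathfrak{A}$ is just $\mathrm{Th}(\mathfrak{A}_{|\mathfrak{A}|})$.

Suppose now that $\mathfrak{A}$ and $M$ are as above, $\mathfrak{B}$ is a structure for $L$ as well
(possibly $\mathfrak{A} = \mathfrak{B}$), and $\phi\colon M \to |\mathfrak{B}|$ is a total function. We may now wish to
import the constants $i$ of $M$ into $L$, as $\bar{i}$, and expand $\mathfrak{B}$ so that, for each $i \in M$,
$\bar{i}$ is interpreted as $\phi(i)$. This expansion of $\mathfrak{B}$ is denoted by $\mathfrak{B}_{\phi[M]}$.†

Thus, all else in $\mathfrak{B}_{\phi[M]}$ is the same as in $\mathfrak{B}$, except that $\bar{i}^{\mathfrak{B}_{\phi[M]}} = \phi(i)$.

We often write $\mathfrak{B}_{\phi[M]} = (|\mathfrak{B}|, \ldots, (\phi(i))_{i \in M})$ to indicate that constants $\bar{i}$
with interpretations $\phi(i)$ were added.

Therefore, if $M \subseteq |\mathfrak{B}|$ and $\phi$ is the inclusion map, then $\mathfrak{B}_{\phi[M]}$ is the
expansion of $\mathfrak{B}$, $\mathfrak{B}_M$.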


The particular way that we have approached the assignment of semantics is not
the "usual" one.‡ One often sees a manner of handling substitution of informal
constants into formal variables that differs from that of I.5.4. Namely, other
authors often do this by assigning (or "corresponding") values to variables,
rather than substituting (formal names of) values into variables. This is achieved
by a so-called assignment function, $s$, from the set of all variables $\{v_0, v_1, \ldots\}$
to the domain, $|\mathfrak{A}|$, of the structure into which we want to interpret a language
$L$. Intuitively, $s(v_j) = a$ says that we have plugged the value $a$ into $v_j$.

† In the choice of our notation we are following Keisler (1978).
‡ It is the same as the one in Shoenfield (1967).

While we have not favoured this approach, we will grant it one thing: It does
not compel us to extend the language $L$ into the diagram language $L(\mathfrak{A})$ to
effect the interpretation. Such extensions may be slightly confusing when we
deal with diagrams.

For example, let $L$ and $\mathfrak{A}$ be as above, with $\emptyset \ne M \subseteq |\mathfrak{A}|$. We next form
$L(M)$. Now, this language has new constants, say $\tilde{i}$, for each $i \in M$. Thus, when
we proceed to interpret it into $\mathfrak{A}_M$, we already have formal counterparts of all
$i \in M$ in $L(M)$, and we need only import the constants (as per I.5.4) of $|\mathfrak{A}| - M$.

But is there any harm done if we re-import all $i \in |\mathfrak{A}|$, as $\bar{i}$, to form $L(M)(\mathfrak{A})$?
Not really. Let us focus on an $i \in M$ which has been imported as $\tilde{i}$ when forming
$L(M)$ and then again as $\bar{i}$ to do what I.5.4 prescribes towards interpreting $L(M)$
in $\mathfrak{A}_M$. Thus, $\tilde{i}$ is a closed term $t$ of $L(M)(\mathfrak{A})$ for which $t^{\mathfrak{A}_M} = i$. Since $\bar{i}$ is the
imported name for $i$, Lemma I.5.11 yields that, for any term $s$ and formula $\mathscr{A}$
over $L(M)(\mathfrak{A})$ of at most one free variable $x$,

$$(s[x \leftarrow \tilde{i}])^{\mathfrak{A}_M} = (s[x \leftarrow \bar{i}])^{\mathfrak{A}_M}$$

and

$$(\mathscr{A}[x \leftarrow \tilde{i}])^{\mathfrak{A}_M} = (\mathscr{A}[x \leftarrow \bar{i}])^{\mathfrak{A}_M}$$

In short, meaning does not depend on which formal alias of $i \in M$ we use. We
will be free to assume then that the $\bar{i}$ were not new (for $i \in M$); they were just
the $\tilde{i}$.

In particular, the statement "the sentence $\mathscr{A}$ over $L(\mathfrak{A})$ is true in $\mathfrak{A}_{|\mathfrak{A}|}$" is
equivalent to the statement "the $\mathfrak{A}$-instance† $\mathscr{A}$ of a formula over $L$ is true
in $\mathfrak{A}$", since a closed formula of $L(\mathfrak{A})$ is an $\mathfrak{A}$-instance of one over $L$ and
conversely ($\tilde{i} \equiv \bar{i}$).


Diagrams offer more than one would expect from just nomenclature, as a 
number of applications in the balance of this section will readily demonstrate. 
First, we translate 1.6.13, 1.6.14, and 1.6.15 into the diagram terminology. 


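I.6.23 Lemma (Main Diagram Lemma). Let $\mathfrak{M}$ and $\mathfrak{K}$ be two structures for
$L$, and $\phi\colon |\mathfrak{M}| \to |\mathfrak{K}|$ be a total function. Then

(1) $\phi\colon \mathfrak{M} \to \mathfrak{K}$ (embedding) iff $\models_{\mathfrak{K}_{\phi[|\mathfrak{M}|]}} D(\mathfrak{M})$;
(2) $\phi\colon \mathfrak{M} \to_{\prec} \mathfrak{K}$ (elementary embedding) iff $\models_{\mathfrak{K}_{\phi[|\mathfrak{M}|]}} \mathrm{Th}(\mathfrak{M}_{|\mathfrak{M}|})$.

Moreover, if $|\mathfrak{M}| \subseteq |\mathfrak{K}|$, then

(3) $\mathfrak{M} \subseteq \mathfrak{K}$ (substructure) iff $\models_{\mathfrak{K}_{|\mathfrak{M}|}} D(\mathfrak{M})$;
(4) $\mathfrak{M} \prec \mathfrak{K}$ (elementary substructure) iff $\models_{\mathfrak{K}_{|\mathfrak{M}|}} \mathrm{Th}(\mathfrak{M}_{|\mathfrak{M}|})$.

† Cf. I.5.7, p. 55.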


Proof. Direct translation of the facts quoted prior to the lemma statement. For
example, the hypothesis of the if part in item (1) above is the same as the
displayed formula in the statement of I.6.13 if we recall that, e.g., an atomic
sentence over $L(M)$ is an $\mathfrak{M}$-instance $\mathscr{A}(\bar{i}_1, \ldots, \bar{i}_n)$ of an atomic formula $\mathscr{A}$
over $L$ and that $\models_{\mathfrak{K}_{\phi[M]}} \mathscr{A}(\bar{i}_1, \ldots, \bar{i}_n)$ is argot for $\mathscr{A}[[\phi(i_1), \ldots, \phi(i_n)]] = \mathbf{t}$ (cf. I.5.17).
The only-if part in (1)-(4) is due to the remark following I.6.14, p. 82.
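Item (1) of the lemma can be checked by brute force on small finite structures. The following is a minimal sketch under simplifying assumptions that are ours, not the text's: the language has a single binary predicate $P$ and no function symbols, so that the basic diagram is a finite set; sentences are encoded as tuples; and all names (`basic_diagram`, `satisfies`, `is_embedding`) are hypothetical.

```python
from itertools import product

def basic_diagram(universe, relation):
    """Atomic and negated atomic sentences true in the diagram expansion.

    A sentence is encoded as (symbol, (i, j), truth): `symbol` is "P" or "=",
    (i, j) stands for the pair of constants naming i and j, and `truth` is
    False when the sentence is the negated atomic one.
    """
    diagram = set()
    for i, j in product(universe, repeat=2):
        diagram.add(("P", (i, j), (i, j) in relation))
        diagram.add(("=", (i, j), i == j))
    return diagram

def satisfies(relation, phi, diagram):
    """Truth of the diagram in the expansion where the name of i denotes phi[i]."""
    for symbol, (i, j), truth in diagram:
        a, b = phi[i], phi[j]
        holds = ((a, b) in relation) if symbol == "P" else (a == b)
        if holds != truth:
            return False
    return True

def is_embedding(rel_m, universe_m, rel_k, phi):
    """Direct definition: phi is 1-1 and preserves P in both directions."""
    injective = len({phi[i] for i in universe_m}) == len(universe_m)
    preserves = all(((i, j) in rel_m) == ((phi[i], phi[j]) in rel_k)
                    for i, j in product(universe_m, repeat=2))
    return injective and preserves

# Hypothetical toy structures: ({0,1,2}, <) and ({0,...,4}, <).
universe_m = [0, 1, 2]
universe_k = [0, 1, 2, 3, 4]
less_m = {(i, j) for i, j in product(universe_m, repeat=2) if i < j}
less_k = {(i, j) for i, j in product(universe_k, repeat=2) if i < j}
D = basic_diagram(universe_m, less_m)

phi = {0: 1, 1: 2, 2: 4}        # order-preserving and 1-1
collapse = {0: 1, 1: 1, 2: 4}   # not 1-1
print(satisfies(less_k, phi, D), satisfies(less_k, collapse, D))
```

The negated equalities in the diagram are what force the map to be 1-1; the negated instances of $P$ are what make preservation run in both directions.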


We present a number of applications of diagrams. First, we revisit the upward
Löwenheim-Skolem theorem (I.5.43).

I.6.24 Theorem (Upward Löwenheim-Skolem Theorem, Version 2). Let $\mathfrak{A}$
be an infinite structure for $L$. For every cardinal $\mathfrak{n}$ that is not smaller than
the cardinal of $\mathfrak{A}$ (that is, of $|\mathfrak{A}|$) and of $L$, there is a structure $\mathfrak{B}$ for $L$, of
cardinality exactly $\mathfrak{n}$, such that $\mathfrak{A} \prec \mathfrak{B}$.


Proof. As in the proof of I.5.43, pick a set $N$ of cardinality $\mathfrak{n}$. The set of
sentences over $L(\mathfrak{A})(N)$

$$Q = \mathrm{Th}(\mathfrak{A}_{|\mathfrak{A}|}) \cup \{\neg\, \bar{c} = \bar{d} : c \ne d \text{ on } N\}$$

where $\bar{c}$ are distinct new constants, one for each $c \in N$, is finitely satisfiable.
This is because every finite subset of $Q$ involves only finitely many of the
sentences $\neg\, \bar{c} = \bar{d}$, and thus there is capacity in $\mathfrak{A} = (A, \mathscr{I})$ (as $A$ is infinite)
to extend $\mathscr{I}$ into $\mathscr{I}'$ (keeping $A$ the same, but defining distinct $\bar{c}^{\mathscr{I}'}$, $\bar{d}^{\mathscr{I}'}$, etc.)
to satisfy these sentences in an expanded structure $\mathfrak{A}' = (A, \mathscr{I}')$.


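$$\text{By compactness, there is a model } \mathfrak{D} = (D, \mathscr{I}_{\mathfrak{D}}) \text{ for } Q \qquad (1)$$

We will look at first into its reduct on $L(\mathfrak{A})$, $\mathfrak{C} = (D, \mathscr{I}_{\mathfrak{C}})$, that is, $\mathscr{I}_{\mathfrak{C}} =
\mathscr{I}_{\mathfrak{D}} \restriction L(\mathfrak{A})$. $\mathfrak{C}$ is a model of $\mathrm{Th}(\mathfrak{A}_{|\mathfrak{A}|})$.

We define next a function $f\colon A \to D$ by $f(\bar{n}^{\mathfrak{A}_{|\mathfrak{A}|}}) = \bar{n}^{\mathfrak{C}}$ (that is, $f(n) = \bar{n}^{\mathfrak{C}}$). The function $f$
is total because all $n \in A$ have been imported, as $\bar{n}$, in the language $L(\mathfrak{A})$ of
$\mathrm{Th}(\mathfrak{A}_{|\mathfrak{A}|})$.

Thus, if $\mathfrak{C}' = (D, \mathscr{I}_{\mathfrak{C}'})$ is the reduct of $\mathfrak{C}$ on $L$ (that is, $\mathscr{I}_{\mathfrak{C}'} = \mathscr{I}_{\mathfrak{C}} \restriction L$), then
the expansion $\mathfrak{C}'_{f[A]}$ of $\mathfrak{C}'$ is just $\mathfrak{C}$, hence, $\models_{\mathfrak{C}'_{f[A]}} \mathrm{Th}(\mathfrak{A}_{|\mathfrak{A}|})$. By the main diagram
lemma above,

$$f\colon \mathfrak{A} \to_{\prec} \mathfrak{C}' \qquad (2)$$

Without loss of generality, we may assume in (2) that $A \subseteq D$ (see Remark I.6.17)
and that $f$ is the inclusion map ($\bar{n}^{\mathfrak{A}_{|\mathfrak{A}|}} = \bar{n}^{\mathfrak{C}}$). That is,

$$\mathfrak{A} \prec \mathfrak{C}'$$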


Thus, it remains to settle the cardinality claims. First off, since $c \ne d$ in $N$
implies that $\neg\, \bar{c} = \bar{d}$ is true in $\mathfrak{D}$ by (1), it follows that the cardinality of $D$
is $\ge \mathfrak{n}$.† That is, the cardinality of $\mathfrak{C}'$ (that is, of its domain $D$) is $\ge \mathfrak{n}$.

By the downward Löwenheim-Skolem theorem (in the form I.6.21) there is
a structure $\mathfrak{B}$ of cardinality exactly $\mathfrak{n}$ such that $\mathfrak{A} \prec \mathfrak{B} \prec \mathfrak{C}'$.


Theorem 1.6.24 provides different information than its counterpart, 1.5.43. The 
former ignores axioms and works with a language and its structures. In the pro- 
cess it gives a strong type of extension between the given structure and the 
constructed structure. The latter downplays structures and is about a language, 
a set of axioms, and the size of the models that we can build for those axioms. 
Version I.5.43 has the important consequence that theories over countable
languages that have any infinite model at all (such as ZFC) must also have
models of any infinite cardinality. In particular they must have countable models.


We conclude the section with a further sampling of applications. 


I.6.25 Definition. We call a theory $\mathscr{T}$ open iff it is the set of theorems of a set of
axioms $\Gamma$ that consists entirely of open formulas, i.e., quantifier-free formulas.
Such is, for example, group theory and ROB arithmetic (I.9.32). Such theories
are also called universal, since one can generate them with a different $\Gamma'$. The
latter's formulas are all the universal closures of open formulas, those in $\Gamma$.

Thus, by the completeness theorem, a theory $\mathscr{T}$ is open iff there is a set of open
formulas, $\Gamma$, such that $\mathscr{T}$ and $\Gamma$ have the same models.


We have the following model-theoretic result for open theories: 


I.6.26 Theorem (Łoś-Tarski). A theory $\mathscr{T}$ (in the sense of p. 38) is open iff
every substructure of every model of $\mathscr{T}$ is also a model.


Proof. Only-if part: Exercise I.53.

If part: All this is over some language $L$. Let $\Gamma$ be the set of all open theorems
of $\mathscr{T}$ (over $L$).‡ We need to show that $\mathscr{T} = \mathbf{Thm}_{\Gamma}$, or, for every structure $\mathfrak{M}$
for $L$,

$$\models_{\mathfrak{M}} \Gamma \quad \text{implies} \quad \models_{\mathfrak{M}} \mathscr{T} \qquad (1)$$

† The function $N \ni n \mapsto \bar{n}^{\mathfrak{D}} \in D$ is total and 1-1.
‡ According to the definition of a theory on p. 38, in particular the displayed formula (1) there,
"open theorems of $\mathscr{T}$" is synonymous with "open formulas in $\mathscr{T}$".


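To this end, we assume the hypothesis in (1) and consider $D(\mathfrak{M}) \cup \mathscr{T}$. We
claim that this is a consistent set of formulas over $L(\mathfrak{M})$. Otherwise
$\mathscr{T} \cup \{\mathscr{A}_1, \ldots, \mathscr{A}_n\} \vdash \neg x = x$, where $\mathscr{A}_1, \ldots, \mathscr{A}_n$ is a finite set of sentences of
$D(\mathfrak{M})$. By the deduction theorem

$$\mathscr{T} \vdash \neg x = x \lor \neg\mathscr{A}_1 \lor \cdots \lor \neg\mathscr{A}_n$$

Hence (tautological implication and logical axiom $x = x$)

$$\mathscr{T} \vdash \neg\mathscr{A}_1 \lor \cdots \lor \neg\mathscr{A}_n \qquad (2)$$

By the theorem on constants (I.4.15), since $\mathscr{T}$ is over $L$, (2) yields

$$\mathscr{T} \vdash \neg\mathscr{A}'_1 \lor \cdots \lor \neg\mathscr{A}'_n$$

where $\neg\mathscr{A}'_1 \lor \cdots \lor \neg\mathscr{A}'_n$ is obtained from $\neg\mathscr{A}_1 \lor \cdots \lor \neg\mathscr{A}_n$ by replacing all
the imported constants $\bar{i}, \bar{j}, \ldots$ which appear in the latter formula by distinct
new variables. Thus, $\neg\mathscr{A}'_1 \lor \cdots \lor \neg\mathscr{A}'_n$ is in $\Gamma$, hence is valid in $\mathfrak{M}$. That is,
assuming the formula's variables are among $\vec{y}_r$, for all $\vec{a}_r$ in $|\mathfrak{M}|$,

$$\models_{\mathfrak{M}} (\neg\mathscr{A}'_1 \lor \cdots \lor \neg\mathscr{A}'_n)[[\vec{a}_r]] \qquad (3)$$

We can thus choose the $\vec{a}_r$ so that $\mathscr{A}_j \equiv \mathscr{A}'_j[[\vec{a}_r]]$ for $j = 1, \ldots, n$. Since
(3) entails that $\mathscr{A}'_j[[\vec{a}_r]]$ is false† in $\mathfrak{M}_{|\mathfrak{M}|}$ for at least one $j$, this contradicts
that all the $\mathscr{A}_j$ are in $D(\mathfrak{M})$.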

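Thus, by the consistency theorem, there is a model $\mathfrak{K}$ for $D(\mathfrak{M}) \cup \mathscr{T}$.
Without loss of generality (cf. Remark I.6.17), the mapping $i \mapsto \bar{i}^{\mathfrak{K}}$ is the
identity. If we now call $\mathfrak{A}$ the reduct of $\mathfrak{K}$ on $L$, then the expansion $\mathfrak{A}_{|\mathfrak{M}|}$ of $\mathfrak{A}$
is $\mathfrak{K}$; therefore $\models_{\mathfrak{A}_{|\mathfrak{M}|}} D(\mathfrak{M})$. The diagram lemma implies that $\mathfrak{M} \subseteq \mathfrak{A}$. Since
$\mathfrak{A}$ is a model of $\mathscr{T}$, then so is $\mathfrak{M}$ by hypothesis. This settles (1).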


I.6.27 Example. Group theory is usually formulated in a language that employs
the nonlogical symbols "$\circ$" (binary function), "${}^{-1}$" (unary function) and "1"
(constant). Its axioms are open, the following:

(1) $x \circ (y \circ z) = (x \circ y) \circ z$‡;
(2) $x \circ 1 = 1 \circ x = x$ (using "=" conjunctionally);
(3) $x \circ x^{-1} = x^{-1} \circ x = 1$.

† Cf. discussion on p. 89.
‡ Where we are using the habitual infix notation, "$x \circ y$", rather than "$\circ xy$" or "$\circ(x, y)$".

The models of the above axioms are called groups. Over this language, every
substructure of a group, that is, every subset that includes (the interpretation
of) 1 and is closed under $\circ$ and ${}^{-1}$, is itself a group, as the only-if direction of
the above theorem guarantees.

It is possible to formulate group theory in a language that has only "$\circ$" and
"1" as nonlogical symbols. Its axioms are not open:

(i) $x \circ (y \circ z) = (x \circ y) \circ z$;
(ii) $x \circ 1 = 1 \circ x = x$;
(iii) $(\exists y)\, x \circ y = 1$;
(iv) $(\exists y)\, y \circ x = 1$.

Since the Łoś-Tarski theorem is an "iff" theorem, it must be that group theory
so formulated has models (still called groups) that have substructures that are
not groups (they are semigroups with unit, however).
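A familiar instance is the additive group of the integers, read in the language with only "$\circ$" and "1": the non-negative integers contain the unit 0 and are closed under +, yet 1 has no inverse among them. The sketch below checks this on a finite window of integers. The window, the helper names and the choice of example are our own assumptions, and a bounded search is of course only an illustration of the argument, not a proof about all of the integers.

```python
def is_closed_substructure(member, op, unit, sample):
    """On a finite sample, check that the subset picked out by `member`
    contains the unit and is closed under the binary operation `op`."""
    inside = [x for x in sample if member(x)]
    return member(unit) and all(member(op(x, y)) for x in inside for y in inside)

def has_inverse_inside(x, member, op, unit, candidates):
    """Search `candidates` for some y in the subset with x*y = y*x = unit."""
    return any(member(y) and op(x, y) == unit and op(y, x) == unit
               for y in candidates)

def add(x, y):
    return x + y

def is_natural(x):
    return x >= 0

def is_integer(x):
    return True

window = range(-50, 51)   # hypothetical finite window of integers

print(is_closed_substructure(is_natural, add, 0, window))
print(has_inverse_inside(1, is_integer, add, 0, window))
print(has_inverse_inside(1, is_natural, add, 0, window))
```

In the richer language with "${}^{-1}$" the non-negative integers are not a substructure at all, since they are not closed under the inverse function; this is exactly why the open axiomatization passes the test of the theorem.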


I.6.28 Definition. Constructions in model theory often result in increasing
sequences of structures, either of the type

$$\mathfrak{M}_0 \subseteq \mathfrak{M}_1 \subseteq \mathfrak{M}_2 \subseteq \cdots \qquad (1)$$

or of the type

$$\mathfrak{M}_0 \prec \mathfrak{M}_1 \prec \mathfrak{M}_2 \prec \cdots \qquad (2)$$

Sequences of type (1) are called increasing chains or ascending chains of
structures. Those of type (2) are called elementary chains.

Chains are related to theories that can be axiomatized using exclusively
existential axioms, that is, axioms of the form $(\exists x_1) \ldots (\exists x_n)\mathscr{A}$, for $n \ge 0$,† where
$\mathscr{A}$ is open. Such theories are called inductive. By taking the universal closures
of existential axioms we obtain formulas of the form $(\forall y_1) \ldots (\forall y_m)(\exists x_1) \ldots
(\exists x_n)\mathscr{A}$. For this reason, inductive theories are also called $\forall\exists$-theories.


The following result is easy to prove, and is left as an exercise. 


I.6.29 Theorem (Tarski's Lemma). The union $\mathfrak{M}$ of an elementary chain (2)
above is an elementary extension of every $\mathfrak{M}_i$.

† $n = 0$ means that the prefix $(\exists x_1) \ldots (\exists x_n)$ is missing. Thus an open formula is a special case
of an existential one. As a matter of fact, we can force a nonempty existential prefix on an open
formula: If $z$ does not occur in $\mathscr{A}$, then we can prove $\mathscr{A} \leftrightarrow (\exists z)\mathscr{A}$ without nonlogical axioms.

Proof. First off, a chain of type (2) is also of type (1); thus, $\mathfrak{M}_i \subseteq \mathfrak{M}_{i+1}$ for all
$i \ge 0$. Let $M$ and $M_i$ denote the universes of $\mathfrak{M}$ and $\mathfrak{M}_i$ respectively. Then by
"union of the $\mathfrak{M}_i$" we understand that we have taken $M = \bigcup_{i \ge 0} M_i$ and

(i) $P^{\mathfrak{M}} = \bigcup_{i \ge 0} P^{\mathfrak{M}_i}$, for each predicate $P$,
(ii) $f^{\mathfrak{M}} = \bigcup_{i \ge 0} f^{\mathfrak{M}_i}$, for each function $f$ (note that $f^{\mathfrak{M}_i} \subseteq f^{\mathfrak{M}_{i+1}}$ for all
$i \ge 0$), and
(iii) $a^{\mathfrak{M}} = a^{\mathfrak{M}_0}$, for each constant.

The real work is delegated to Exercise I.54.
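The bookkeeping in (i) and (ii) can be sketched on a finite initial segment of a chain. In the illustration below the structures are $(\{0, \ldots, i\}, <, \max)$ for $i = 1, \ldots, 5$; this choice of chain, the representation of functions by their graphs (Python dictionaries), and the names `restricted_structure` and `union_of_chain` are all assumptions made for the example.

```python
from itertools import product

def restricted_structure(universe, pred, func):
    """Return (M, P, f): the universe, the graph of the binary predicate and
    the graph of the binary function, both restricted to the universe."""
    M = set(universe)
    P = {(a, b) for a, b in product(M, repeat=2) if pred(a, b)}
    f = {(a, b): func(a, b) for a, b in product(M, repeat=2)}
    assert set(f.values()) <= M, "universe must be closed under the function"
    return M, P, f

def union_of_chain(chain):
    """Union of universes, of predicate graphs and of function graphs."""
    M, P, f = set(), set(), {}
    for M_i, P_i, f_i in chain:
        M |= M_i
        P |= P_i
        f.update(f_i)   # harmless: the graphs f_i form an increasing chain
    return M, P, f

def less(a, b):
    return a < b

chain = [restricted_structure(range(i + 1), less, max) for i in range(1, 6)]

# Each function graph is included in the next, as noted in (ii).
increasing = all(f_i.items() <= f_next.items()
                 for (_, _, f_i), (_, _, f_next) in zip(chain, chain[1:]))

M, P, f = union_of_chain(chain)
print(increasing, sorted(M), f[(2, 5)])
```

The inclusion of each function graph in the next is what makes the union of the graphs single-valued, so that it is again the graph of a function on the union of the universes.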


I.6.30 Theorem (Chang-Łoś-Suszko). The union of every type-(1) chain
(I.6.28) of models of a theory $\mathscr{T}$ (the latter in the sense of p. 38) is a model of
$\mathscr{T}$ iff the theory is inductive.


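Proof. The proof is standard (e.g., Chang and Keisler (1973), Keisler (1978),
Shoenfield (1967)).

If part: Delegated to Exercise I.55.

Only-if part: Assume the hypothesis for a theory $\mathscr{T}$ over $L$. We will prove
that it is inductive. To this end, let $\Gamma$ be the set of all existential consequences
of $\mathscr{T}$, i.e., formulas $(\exists x_1) \ldots (\exists x_n)\mathscr{A}$, $n \ge 0$, with $\mathscr{A}$ open. As in the proof
of I.6.26, we endeavour to prove

$$\models_{\mathfrak{M}} \Gamma \quad \text{implies} \quad \models_{\mathfrak{M}} \mathscr{T} \qquad (1)$$

Assume the hypothesis in (1), and let $D_{\forall}(\mathfrak{M})$ denote the set of all universal
sentences† over $L(\mathfrak{M})$ that are true in $\mathfrak{M}_M$, where we have written $M$ for $|\mathfrak{M}|$.
We argue that $D_{\forall}(\mathfrak{M}) \cup \mathscr{T}$ is consistent. If not, $\mathscr{T} \vdash \neg x = x \lor \neg\mathscr{A}_1 \lor \cdots \lor \neg\mathscr{A}_n$,
for some $\mathscr{A}_i \in D_{\forall}(\mathfrak{M})$; hence (notation as in the proof
of I.6.26)

$$\mathscr{T} \vdash \neg(\mathscr{A}'_1 \land \cdots \land \mathscr{A}'_n)$$

Since $\vdash \mathscr{A} \leftrightarrow (\forall z)\mathscr{A}$ if $z$ is not free in $\mathscr{A}$, and $\forall$ distributes over $\land$ (see
Exercise I.23), we can rewrite the above as

$$\mathscr{T} \vdash \neg(\forall \vec{x}_r)(\mathscr{B}_1 \land \cdots \land \mathscr{B}_n) \qquad (2)$$

where the $\mathscr{B}_i$ are open over $L$, and $(\forall \vec{x}_r)$ is short for $(\forall x_1) \ldots (\forall x_r)$. The formula
in (2) is (logically equivalent to one that is) existential. Thus, it is in $\Gamma$, and
hence is true in $\mathfrak{M}$, an impossibility, since no $\mathscr{A}_i \in D_{\forall}(\mathfrak{M})$ can be false in $\mathfrak{M}_M$.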


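† A universal formula has the form $(\forall x_1) \ldots (\forall x_n)\mathscr{A}$, $n \ge 0$, where $\mathscr{A}$ is open. Thus ($n = 0$) every
open formula is also universal, as well as existential.

We can now have a model, $\mathfrak{K}' = (K, \ldots, (a)_{a \in M})$, of $D_{\forall}(\mathfrak{M}) \cup \mathscr{T}$, where
"$(a)_{a \in M}$" indicates the special status of the constants $a \in M \subseteq K$: These have
been added to $L$, as $\bar{a}$, to form $L(\mathfrak{M})$. If $\mathfrak{K} = (K, \ldots)$ is the reduct of $\mathfrak{K}'$ to $L$,
then $\mathfrak{K}' = \mathfrak{K}_M$; hence

$$\models_{\mathfrak{K}_M} D_{\forall}(\mathfrak{M}) \qquad (3)$$

Since $D(\mathfrak{M}) \subseteq D_{\forall}(\mathfrak{M})$, we obtain

$$\mathfrak{M} \subseteq \mathfrak{K} \text{ (by I.6.23)}, \quad \text{and} \quad \models_{\mathfrak{K}} \mathscr{T} \text{ ($\mathscr{T}$ is over $L$)} \qquad (4)$$

Remark I.6.17 was invoked without notice in assuming that the $\bar{a}$ of $L(\mathfrak{M})$ are
interpreted in $\mathfrak{K}'$ exactly as in $\mathfrak{M}_M$: as $a$.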


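We next argue that the theory $\mathscr{S} = D(\mathfrak{K}) \cup \mathrm{Th}(\mathfrak{M}_M)$ over† $L(K)$ is consistent,
where $D(\mathfrak{K})$ is as in I.6.22, i.e., the set of all atomic and negated atomic
sentences over $L(K)$ that are true in $\mathfrak{K}_K = (K, \ldots, (a)_{a \in M}, (a)_{a \in K - M})$. Note
that $L(K)$ is obtained by adding to $L(M)$ the constants $(\bar{a})_{a \in K - M}$, each $\bar{a}$
interpreted as $a$ in $\mathfrak{K}_K$. That is, $a \in M$ were not re-imported as something other
than the original $\bar{a}$ of $L(M)$.

Now, if $\mathscr{S}$ is not consistent, then some $\mathscr{A}_i \in D(\mathfrak{K})$ ($i = 1, \ldots, n$) jointly
with $\mathrm{Th}(\mathfrak{M}_M)$ prove $\neg x = x$. Since the $\mathscr{A}_i$ are sentences (over $L(K)$), the
usual technique yields

$$\mathrm{Th}(\mathfrak{M}_M) \vdash \neg\mathscr{A}_1 \lor \cdots \lor \neg\mathscr{A}_n$$

Let $\neg\mathscr{A}'_1 \lor \cdots \lor \neg\mathscr{A}'_n$ be obtained from $\neg\mathscr{A}_1 \lor \cdots \lor \neg\mathscr{A}_n$ by replacing all its
constants $\bar{a}, \bar{b}, \ldots$ (where $a, b, \ldots$ are in $K - M$) by distinct new variables
$\vec{x}_r$. Thus (I.4.15)

$$\mathrm{Th}(\mathfrak{M}_M) \vdash (\forall \vec{x}_r)(\neg\mathscr{A}'_1 \lor \cdots \lor \neg\mathscr{A}'_n) \qquad (5)$$

Since $(\forall \vec{x}_r)(\neg\mathscr{A}'_1 \lor \cdots \lor \neg\mathscr{A}'_n)$ is a sentence over $L(M)$, (5) implies that it is
in $\mathrm{Th}(\mathfrak{M}_M)$, i.e., true in $\mathfrak{M}_M$. Thus, being universal, it is in $D_{\forall}(\mathfrak{M})$. By (3) it is
true in $\mathfrak{K}_M$. In particular, its $\mathfrak{K}_M$-instance (cf. I.5.7) $\neg\mathscr{A}_1 \lor \cdots \lor \neg\mathscr{A}_n$, a
sentence of $L(K)$, is true in $\mathfrak{K}_M$ in the sense of I.5.7, and hence also in $\mathfrak{K}_K$.

This is impossible, since $\mathscr{A}_i \in D(\mathfrak{K})$ implies that no sentence $\mathscr{A}_i$ can be false
in $\mathfrak{K}_K$.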


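Thus, there is a model $\mathfrak{A} = (A, \ldots)$ of $\mathscr{S}$ over $L(K)$. Now, on one hand we
have

$$\models_{\mathfrak{A}} D(\mathfrak{K}) \qquad (6)$$

† The part $\mathrm{Th}(\mathfrak{M}_M)$ is over $L(M)$, of course.

and we may assume, without loss of generality (cf. I.6.17), that

$$\text{for } a \in K, \quad \bar{a}^{\mathfrak{K}_K} \mapsto \bar{a}^{\mathfrak{A}} \text{ is the inclusion map} \qquad (7)$$

Therefore, $K \subseteq A$, and we may write $\mathfrak{A}$ a bit more explicitly as

$$\mathfrak{A} = (A, \ldots, (a)_{a \in M}, (a)_{a \in K - M}) \qquad (8)$$

Let us call $\mathfrak{M}' = (A, \ldots)$ the reduct of $\mathfrak{A}$ on $L$. By (8), $\mathfrak{A} = \mathfrak{M}'_K$, so that
(6) reads $\models_{\mathfrak{M}'_K} D(\mathfrak{K})$; hence (by I.6.23)

$$\mathfrak{K} \subseteq \mathfrak{M}' \qquad (9)$$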


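On the other hand, every sentence of $\mathrm{Th}(\mathfrak{M}_M)$, a subtheory of $\mathscr{S}$, is true in
$\mathfrak{A}$. The relevant part of $\mathfrak{A}$ is its reduct on $L(M)$, namely, $\mathfrak{M}'_M$. Thus,
$\models_{\mathfrak{M}'_M} \mathrm{Th}(\mathfrak{M}_M)$, and therefore

$$\mathfrak{M} \prec \mathfrak{M}' \qquad (10)$$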


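After all this work we concentrate on the core of what we have obtained: (4),
(9), and (10). By (10) and remarks in I.6.16 we have that $\models_{\mathfrak{M}'} \Gamma$. Thus we can
repeat our construction all over again, using $\mathfrak{M}'$ in the place of $\mathfrak{M}$, to obtain
$\mathfrak{K}' \supseteq \mathfrak{M}'$ with $\models_{\mathfrak{K}'} \mathscr{T}$, and also a $\mathfrak{M}''$ that is an elementary extension of $\mathfrak{M}'$.

In short, we can build, by induction on $n$, an alternating chain

$$\mathfrak{M}_0 \subseteq \mathfrak{K}_0 \subseteq \mathfrak{M}_1 \subseteq \mathfrak{K}_1 \subseteq \cdots \qquad (11)$$

such that

$$\mathfrak{M}_0 \prec \mathfrak{M}_1 \prec \cdots \qquad (12)$$

and

$$\models_{\mathfrak{K}_n} \mathscr{T} \quad \text{for } n \ge 0 \qquad (13)$$

where $\mathfrak{M}_0 = \mathfrak{M}$, and, for $n \ge 0$, $\mathfrak{K}_n$ and $\mathfrak{M}_{n+1}$ are obtained as "$\mathfrak{K}$" and "$\mathfrak{M}'$"
respectively from $\mathfrak{M}_n$, the latter posing as the "$\mathfrak{M}$" in the above construction.
By assumption and (13), the union of the alternating chain (11) is a model, $\mathfrak{B}$,
of $\mathscr{T}$. Since $\mathfrak{B}$ also equals the union of the chain (12), Tarski's lemma (I.6.29)
yields $\mathfrak{M} \prec \mathfrak{B}$. Hence $\models_{\mathfrak{M}} \mathscr{T}$, proving (1).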


Some issues on the cardinality of models are addressed by the upward and
downward Löwenheim-Skolem theorems. Here we sample the connection between
possible uniqueness of models of a given cardinality and completeness.
As before, a theory $\mathscr{T}$ over some language $L$ is (simply) complete iff for all
sentences $\mathscr{A}$, one of $\mathscr{T} \vdash \mathscr{A}$ or $\mathscr{T} \vdash \neg\mathscr{A}$ holds. If $\mathscr{T}$ is in the sense of p. 38,
then it is complete iff for all sentences $\mathscr{A}$, one of $\mathscr{A} \in \mathscr{T}$ or $(\neg\mathscr{A}) \in \mathscr{T}$
holds.


I.6.31 Definition. A theory $\mathscr{T}$ over a language $L$ is $\kappa$-categorical (where $\kappa \ge$
cardinality of $L$) iff any two models of $\mathscr{T}$ of cardinality $\kappa$ are isomorphic. In
essence, there is only one model of cardinality $\kappa$.

I.6.32 Theorem (Łoś-Vaught Test). If $\mathscr{T}$ over $L$ has no finite models and is
$\kappa$-categorical for some $\kappa$, then it is complete.


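Proof. The hypothesis is vacuously satisfied by an inconsistent theory. If $\mathscr{T}$
is inconsistent, then it is complete. Let it be consistent then, and assume that
there is some sentence $\mathscr{A}$ that is neither provable nor refutable. Then both
$\mathscr{T} \cup \{\mathscr{A}\}$ and $\mathscr{T} \cup \{\neg\mathscr{A}\}$ are consistent. Let $\mathfrak{A}$ and $\mathfrak{B}$ be respective models.
By the upward Löwenheim-Skolem theorem there are structures $\mathfrak{A}'$ and $\mathfrak{B}'$ of
cardinality $\kappa$ each such that $\mathfrak{A} \prec \mathfrak{A}'$ and $\mathfrak{B} \prec \mathfrak{B}'$; hence (by assumption and
remarks in I.6.16)

$$\mathfrak{A} \equiv \mathfrak{A}' \cong \mathfrak{B}' \equiv \mathfrak{B}$$

Thus $\mathscr{A}$ is true in $\mathfrak{B}$ (since it is so in $\mathfrak{A}$), which is absurd.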


Our last application is perhaps the most startling. It is the rigorous introduction
of Leibniz's infinitesimals, that is, "quantities" that are non-zero,† yet
their absolute values are less than every positive real number. Infinitesimals
form the core of a user-friendly (sans $\varepsilon$-$\delta$, that is) introduction to limits and
the differential and integral calculus, in the manner that Leibniz conceived
calculus (but he was never able to prove that infinitesimals "existed"). Such
approaches to learning calculus (sometimes under the title non-standard analy- 
sis) have been accessible to the undergraduate student for quite some time now 
(e.g., Henle and Kleinberg (1980); see also Keisler (1976)). The legitimization 
of infinitesimals was first announced in 1961 by Abraham Robinson (1961, 
1966). Besides making calculus more “natural”, non-standard analysis has also 
been responsible for the discovery of new results. 

All that we need to do now is to extend the standard structure for the reals,
$\mathfrak{R} = (\mathbb{R}, \ldots)$, where "$\ldots$" includes "+", "$\times$", "<", "0", "1", etc., so that
it is enriched with "new" numbers, including infinitesimals. Given the tools at
our disposal, this will be surprisingly easy:

† Actually, 0 is an infinitesimal, as it will turn out. But there are uncountably many that are non-zero.

Thus, we start by extending the first order language for the reals, by importing
all constants, all functions, and all relations over $\mathbb{R}$ into the language as formal
constants, functions, and predicates respectively.


There are two issues we want to be clear about. 


(1) In the exposition that follows we will almost exclusively use (an argot
of) the formal notation for the extended language of the reals and,
infrequently, informal notations for concrete objects (e.g., reals). We are
thus best advised to keep the formal language notation uncomplicated, and
complicate, if we must, only the notation for the concrete objects. Thus,
e.g., we will denote by "$<$" the formal, and by "${}^\circ\!<$" the concrete, predicate
on the reals; by "$\sqrt{2}$" the formal, and by "${}^\circ\sqrt{2}$" the concrete constant.† In
short, we import $a$, $f$, $P$ for all informal ${}^\circ a \in \mathbb{R}$, for all functions ${}^\circ f$ with
inputs and outputs in $\mathbb{R}$, and for all relations ${}^\circ P$ on $\mathbb{R}$. Having done that,
we interpret as follows: $a^{\mathfrak{R}} = {}^\circ a$, $f^{\mathfrak{R}} = {}^\circ f$, and $P^{\mathfrak{R}} = {}^\circ P$. We will call
the language so obtained, simply, $L$. Due to lack of imagination, we will
call the expansion of the original $\mathfrak{R}$ still $\mathfrak{R}$.‡ Thus

$$\mathfrak{R} = (\mathbb{R}, \ldots, ({}^\circ a : {}^\circ a \in \mathbb{R}), ({}^\circ f : {}^\circ f \text{ on } \mathbb{R}), ({}^\circ P : {}^\circ P \text{ on } \mathbb{R})) \qquad (i)$$

(2) Not all functions on $\mathbb{R}$ are total (i.e., everywhere defined). For example, the
function $x \mapsto \sqrt{x}$ is undefined for $x < 0$. This creates a minor annoyance
because functions of a first order language are supposed to be interpreted as
(i.e., are supposed to name) total functions. We get around this by simply
not importing (names for) nontotal functions. However, we do import their
graphs,§ for these are just relations. All we have to do is to invent appropriate
notation for the formal name of such graphs. For example, suppose we want
to formally name "$y = {}^\circ\sqrt{x}$". We just use $y = \sqrt{x}$. Thus, the formal (binary)
predicate name is the compound symbol "$= \sqrt{\phantom{x}}$", which we employ in infix
notation when building an atomic formula from it. See also I.6.37 later in
this section (p. 101).


With the above out of the way, we fix a cardinal $\mathfrak{m}$ that is bigger than the
maximum of the cardinal of $L$ and $\mathbb{R}$. By the upward Löwenheim-Skolem
theorem there is a structure

$${}^*\mathfrak{R} = ({}^*\mathbb{R}, \ldots, ({}^*a : {}^\circ a \in \mathbb{R}), ({}^*f : {}^\circ f \text{ on } \mathbb{R}), ({}^*P : {}^\circ P \text{ on } \mathbb{R})) \qquad (ii)$$

for $L$ such that

$$\mathfrak{R} \prec {}^*\mathfrak{R} \qquad (1)$$

and

$$(\text{cardinality of } {}^*\mathbb{R}) = \mathfrak{m} \qquad (2)$$

In particular (from (1) and (2)),

$$\mathbb{R} \subset {}^*\mathbb{R} \qquad (3)$$

† There is nothing odd about this. Both "$\sqrt{2}$" and "${}^\circ\sqrt{2}$" are names. Unlike what we do in normal
use, we pretend here that the former is the formal name, but the latter is the informal name for
the object called the "square root of two".
‡ No confusion should arise from this, for neither the original bare bones $\mathfrak{R}$ nor its language
(which we haven't named) will be of any future use.
§ The graph of an $n$-ary function ${}^\circ f$ is the relation $\{(y, \vec{x}_n) : y = {}^\circ f(\vec{x}_n) \text{ is true in } \mathbb{R}\}$, or simply,
$y = {}^\circ f(\vec{x}_n)$.


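I.6.33 Definition. The members of ${}^*\mathbb{R}$ are called the hyperreals or hyperreal
numbers. By (3), all reals are also hyperreals. The members of ${}^*\mathbb{R} - \mathbb{R}$, a
nonempty set by (3), are the non-standard numbers or non-standard hyperreals.
The members of $\mathbb{R}$ are the standard numbers or standard hyperreals.

(1) says that a first order sentence is true in $\mathfrak{R}$ iff it is true in ${}^*\mathfrak{R}$. This is neat. It
gives us a practical transfer principle (as it is called in Keisler (1982), although
in a different formulation): To verify a first order sentence in ${}^*\mathfrak{R}$ (respectively,
in $\mathfrak{R}$) one only needs to verify it in $\mathfrak{R}$ (respectively, in ${}^*\mathfrak{R}$).

By (1) (which implies $\mathfrak{R} \subseteq {}^*\mathfrak{R}$) we also have

$${}^\circ a = {}^*a \quad \text{for all } {}^\circ a \in \mathbb{R} \qquad (4)$$

$${}^\circ f \subseteq {}^*f \quad \text{for all } {}^\circ f \text{ on } \mathbb{R} \qquad (5)$$

and

$${}^\circ P \subseteq {}^*P \quad \text{for all } {}^\circ P \text{ on } \mathbb{R} \qquad (6)$$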


I.6.34 Example. Here is an example of how (4) and (6) apply:

$$(\forall x)(x < 0 \to \neg(\exists y)(y = \sqrt{x})) \qquad (7)$$

is true in $\mathfrak{R}$. Hence it is true in ${}^*\mathfrak{R}$. Moreover, $(-6) < 0$ is true in ${}^*\mathfrak{R}$, since it
is true in $\mathfrak{R}$.

Pause. We have used formal notation in writing the two previous formulas.
By (7) and specialization, $(-6) < 0 \to \neg(\exists y)(y = \sqrt{(-6)})$. Thus,

$$\neg(\exists y)(y = \sqrt{(-6)})$$

is true in ${}^*\mathfrak{R}$. Concretely this says that if we are working in the set ${}^*\mathbb{R}$, then
(using $x \mapsto {}^*\sqrt{x}$ for the concrete function on ${}^*\mathbb{R}$, and also using (4))

$${}^*\sqrt{{}^\circ(-6)} \text{ is undefined}$$

So, nontotal functions continue (i.e., their extensions on ${}^*\mathbb{R}$ do) being undefined
on the same reals on which they failed to be defined on $\mathbb{R}$.

The reader can similarly verify that ${}^*\mathbb{R}$ will not forgive division by zero,
using, for example, the formal sentence $\neg(\exists y)(y \cdot 0 = 1)$.


I.6.35 Example. Now start with the sentence (written formally!) $\sin(\pi) = 0$.
This is true in $\mathfrak{R}$, and hence in ${}^*\mathfrak{R}$. A more messy way to say this is to say that by
(5) (note that ${}^\circ\sin$ is total; hence the name $\sin$ was imported), ${}^\circ\sin \subseteq {}^*\sin$, and
hence (cf. (4)) ${}^*\sin({}^*\pi) = {}^*\sin({}^\circ\pi) = {}^\circ\sin({}^\circ\pi) = {}^\circ 0 = {}^*0$. Thus, $\sin(\pi) = 0$,
the formal counterpart of ${}^*\sin({}^*\pi) = {}^*0$, is true in ${}^*\mathfrak{R}$.


Elementary arithmetic on R carries over on *R without change. Below is a 
representative sample. 


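I.6.36 Example. First off, ${}^*\!<$ is a total order on ${}^*\mathbb{R}$. That is, the following
three sentences are true in ${}^*\mathfrak{R}$ (because they are so in $\mathfrak{R}$):

$$(\forall x)(\neg\, x < x)$$

$$(\forall x)(\forall y)(x < y \lor y < x \lor x = y)$$

and

$$(\forall x)(\forall y)(\forall z)(x < y \land y < z \to x < z)$$

The formal symbol $\le$, as usual, is introduced by a definition

$$x \le y \leftrightarrow x = y \lor x < y$$

Can we add inequalities, term by term, in ${}^*\mathbb{R}$? Of course, since

$$(\forall x)(\forall y)(\forall z)(\forall w)(x < y \land z < w \to x + z < y + w)$$

is true in $\mathfrak{R}$, therefore in ${}^*\mathfrak{R}$ as well.
The above has all sorts of obvious variations using, selectively, $\le$ instead of
$<$. Multiplication, term by term, goes through in ${}^*\mathfrak{R}$ under the usual caveat:

$$(\forall x)(\forall y)(\forall z)(\forall w)(0 \le x < y \land 0 \le z < w \to xz < yw)$$

Two more useful inequalities are

$$(\forall x)(\forall y)(\forall z)(z < 0 \land x < y \to zx > zy)$$

and

$$(\forall x)(\forall y)((0 < x \land x < 1/y) \leftrightarrow (0 < y \land y < 1/x))$$

The formal function $|\ldots|$ (absolute value) enjoys the usual properties in ${}^*\mathfrak{R}$
because it does so in $\mathfrak{R}$. Here is a useful sample:

$$(\forall x)(\forall y)(|x + y| \le |x| + |y|)$$

$$(\forall x)(\forall y)(|x - y| \le |x| + |y|)$$

$$(\forall x)(\forall y)(|x| \le y \leftrightarrow -y \le x \le y)$$

Back to equalities: $+$ and $\times$ are, of course, commutative on ${}^*\mathbb{R}$, for
$(\forall x)(\forall y)(x + y = y + x)$ is true in $\mathfrak{R}$ (and similarly for $\times$). Moreover,

$$(\forall x)(\forall y)(|xy| = |x||y|)$$

where we have used "implied $\times$" above. Finally, the following are true in ${}^*\mathfrak{R}$:
$(\forall x)(0x = 0)$, $(\forall x)(1x = x)$, and $(\forall x)((-1)x = -x)$.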


I.6.37 Remark (Importing Functions: the Last Word). We can now conclude
our discussion, on p. 98, about importing nontotal functions. Let then ${}^\circ f$ be an
$n$-ary nontotal function on $\mathbb{R}$. The graph, $y = {}^\circ f(\vec{x}_n)$, is imported as the formal
predicate "$(= f)$"; we write, formally, "$y (= f) \vec{x}_n$" rather than "$y = f\vec{x}_n$" or
"$y = f(\vec{x}_n)$".† Let $y = {}^*f(\vec{x}_n)$ be the extension of $y = {}^\circ f(\vec{x}_n)$ on ${}^*\mathbb{R}$ (cf. (6),
p. 99). We note two things:

One, for any $b, \vec{a}_n$ in $\mathbb{R}$, if $b = {}^\circ f(\vec{a}_n)$, then $b = {}^*f(\vec{a}_n)$ by (6) and (4)
(p. 99).

Two, since

$$(\forall x_1) \ldots (\forall x_n)(\forall y)(\forall z)(y (= f) \vec{x}_n \land z (= f) \vec{x}_n \to y = z)$$

is true in $\mathfrak{R}$ by the assumption that ${}^\circ f$ is a function, i.e., that the relation
$y = {}^\circ f(\vec{x}_n)$ is single-valued in $y$, it is true in ${}^*\mathfrak{R}$ as well, so that the concrete
relation $y = {}^*f(\vec{x}_n)$ is single-valued in $y$. That is, ${}^*f$ is a function. Along with
remark one, above, this yields ${}^\circ f \subseteq {}^*f$. This is the counterpart of (5) (p. 99)
for nontotal functions, and we are pleased to note that it has exactly the form
of (5), without any caveats.

† This avoids a possible misunderstanding that, in something like $y = f\vec{x}_n$, $f$ is an $n$-ary function
letter and hence $f\vec{x}_n$ is a term. $f$ is part of a language symbol, namely, of the predicate "$(= f)$".
It is not a symbol itself.


Pause. Is ${}^*f$ total on ${}^*\mathbb{R}$?

We can now pretend, in practice (i.e., in argot) that all functions of $\mathbb{R}$, total
or not, have been imported and thus have extensions in ${}^*\mathbb{R}$.

We next explore ${}^*\mathfrak{R}$ looking for strange numbers, such as infinitesimals and
infinite numbers. To ensure that we know what we are looking for, we define:

I.6.38 Definition (Infinitesimals). An infinitesimal $h$ is a member of ${}^*\mathbb{R}$ such
that $|h| < x$ is true in ${}^*\mathfrak{R}$ for all positive $x \in \mathbb{R}$. A member $h$ of ${}^*\mathbb{R}$ is a finite
hyperreal iff $|h| < x$ is true in ${}^*\mathfrak{R}$ for some positive $x \in \mathbb{R}$. A member $h$ of ${}^*\mathbb{R}$
is an infinite hyperreal iff it is not finite. That is, $|h| > x$ is true in ${}^*\mathfrak{R}$ for all
positive $x \in \mathbb{R}$.


I.6.39 Remark. Thus, 0 is an infinitesimal, and every infinitesimal is finite.

The reader has surely noticed that we have dropped the annoying left-superscripting of members of ${}^*\mathbb{R}$ (or of $\mathbb{R}$). This is partly because of (4) (p. 99). Since ${}^\circ a = {}^* a$ for all ${}^\circ a \in \mathbb{R}$, it is smart to name both by the simpler formal name, $a$. Moreover, since the left-superscript notation originates in the structure $\mathfrak{R}$, it is not applicable to objects of ${}^*\mathfrak{R}$ unless these are extensions (functions or predicates) of objects of $\mathfrak{R}$. It is thus pointless to write “${}^* h$”.

Even in the cases of functions and predicates, we can usually get away 
without superscripts, letting the context indicate whether we are in R, in *R, or 
in the formal language. For example we have used “|h|” rather than “*|h|”, but 
then we are clear that it is the latter that we are talking about, as the absolute 
value here is applicable to non-standard inputs. 


I.6.40 Proposition. $h \in {}^*\mathbb{R}$ is a nonzero infinitesimal iff $1/|h|$ is infinite.


Proof. If part: Suppose that $1/|h|$ is infinite, and let $0 < r \in \mathbb{R}$ be arbitrary. Then $1/r < 1/|h|$. Specialization of the appropriate inequality in I.6.36 (a true fact in ${}^*\mathfrak{R}$) yields $|h| < r$.

Only-if part: Let $0 < r \in \mathbb{R}$ be arbitrary. By hypothesis, $|h| < 1/r$. By the fact invoked above, $r < 1/|h|$.
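
Both directions of the proof specialize a single ordered-field fact. As a sketch, and assuming that the inequality meant in I.6.36 is the usual rule for reciprocals of positive numbers (the label $(\rho)$ is ours, not the text's), the two specializations can be written out as follows:

```latex
% Assumed form of the inequality from I.6.36; true in \mathfrak{R}, hence in {}^*\mathfrak{R}:
(\forall x)(\forall y)\bigl(0 < x < y \;\to\; 1/y < 1/x\bigr) \tag{$\rho$}

% If part: take x = 1/r and y = 1/|h| in (\rho).
0 < \tfrac{1}{r} < \tfrac{1}{|h|}
  \;\Longrightarrow\; |h| = \frac{1}{1/|h|} < \frac{1}{1/r} = r

% Only-if part: take x = |h| and y = 1/r in (\rho); note 0 < |h| because h \neq 0.
0 < |h| < \tfrac{1}{r}
  \;\Longrightarrow\; r = \frac{1}{1/r} < \frac{1}{|h|}
```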


I.6.41 Proposition. Let $h$ and $h'$ be infinitesimals. Then so are

(1) $h + h'$,
(2) $hh'$,
(3) $hr$ for any $r \in \mathbb{R}$.

Proof. (1): Let $0 < r \in \mathbb{R}$ be arbitrary. Then $|h + h'| \le |h| + |h'| < r/2 + r/2 = r$ is true in ${}^*\mathfrak{R}$.

(2): Let $0 < r \in \mathbb{R}$ be arbitrary. Then $|hh'| = |h||h'| < r \cdot 1 = r$ is true in ${}^*\mathfrak{R}$.

(3): If $r = 0$, then $h0 = 0$ (cf. I.6.36), an infinitesimal. Otherwise, let $0 < s \in \mathbb{R}$ be arbitrary. $|h| < s/|r|$ by hypothesis; hence $|hr| = |h||r| < s$ (the reader can easily verify that we have used legitimate ${}^*\mathfrak{R}$-arithmetic).


I.6.42 Proposition. Let $h$ and $h'$ be infinite hyperreals. Then so are

(1) $hh'$,
(2) $hr$ for any $0 \ne r \in \mathbb{R}$,
(3) $h + r$ for any $r \in \mathbb{R}$.


The prudence to ask that $0 \ne r$ in (2) stems from the concluding remarks in I.6.36.


Proof. (1): Let $0 < r \in \mathbb{R}$ be arbitrary. Then $|hh'| = |h||h'| > r \cdot 1 = r$ is true in ${}^*\mathfrak{R}$.

(2): Let $0 < s \in \mathbb{R}$ be arbitrary. $|h| > s/|r|$ by hypothesis; hence $|hr| = |h||r| > s$.

(3): Let $0 < s \in \mathbb{R}$. Now, $|h + r| \ge |h| - |r|$.


Pause. Do you believe this?


Hence, $s < |h + r|$, since $s + |r| < |h|$.
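
For the reader who paused: the inequality used in (3) is the familiar consequence of the triangle inequality. The derivation below is a sketch that assumes only that the sentence $(\forall x)(\forall y)\,|x + y| \le |x| + |y|$ is true in $\mathfrak{R}$, and therefore also in ${}^*\mathfrak{R}$.

```latex
% Triangle inequality, transferred to {}^*\mathfrak{R}, applied to x = h + r and y = -r:
|h| = |(h + r) + (-r)| \le |h + r| + |{-r}| = |h + r| + |r|

% Rearranging:
|h + r| \ge |h| - |r|

% Since h is infinite and s + |r| is a positive real, |h| > s + |r|; therefore
|h + r| \ge |h| - |r| > s
```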


I.6.43 Example. We observe the phenomenon of indeterminate forms familiar from elementary calculus. There we use the following symbols:


(i) Form $\infty - \infty$. This translates into “there is no fixed rule for what $h - h'$ will yield” (both positive infinite). For example, if $h = 2h'$ (certainly infinite, by (2)), then $h - h'$ is infinite. If $a$ is an infinitesimal, then $h' + a$ is infinite.


Pause. Is that true? 


Hence, if $h = h' + a$, then $h - h' = a$ is an infinitesimal. In particular, it is finite.

(ii) Form $\infty/\infty$. There are three different outcomes for $h/h'$, according as $h = h'$, $h = (h')^2$ (see (1) above), or $h' = h^2$.

(iii) Form $0/0$. This translates to the question “what is the result of $h/h'$, if both are infinitesimal?” It depends: Typical cases are $h = h'$, $h = (h')^2$, and $h' = h^2$.
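
The three outcomes in (ii) and (iii) are left implicit in the text. The following sketch spells them out; it assumes that $h$ and $h'$ are nonzero throughout, and it uses I.6.40 to pass between infinitesimals and infinite numbers.

```latex
% (ii) Form \infty/\infty: h, h' positive infinite.
h = h'      \;\Longrightarrow\; h/h' = 1                 \quad\text{(finite, not infinitesimal)}
h = (h')^2  \;\Longrightarrow\; h/h' = h'                \quad\text{(infinite)}
h' = h^2    \;\Longrightarrow\; h/h' = h/h^2 = 1/h       \quad\text{(infinitesimal, by I.6.40)}

% (iii) Form 0/0: h, h' nonzero infinitesimals.
h = h'      \;\Longrightarrow\; h/h' = 1                 \quad\text{(finite, not infinitesimal)}
h = (h')^2  \;\Longrightarrow\; h/h' = h'                \quad\text{(infinitesimal)}
h' = h^2    \;\Longrightarrow\; h/h' = 1/h               \quad\text{(infinite, by I.6.40)}
```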
The following terminology and notation are useful. They are at the heart of
the limiting processes. 


I.6.44 Definition. We say that two hyperreals $a$ and $b$ are infinitely close, in symbols $a \approx b$, iff $a - b$ is an infinitesimal.

Thus, in particular (since $a - 0 = a$ is true), $a \approx 0$ says that $a$ is an infinitesimal.


I.6.45 Proposition. $\approx$ is an equivalence relation on ${}^*\mathbb{R}$. That is, for all $x, y, z$, it satisfies

(1) $x \approx x$,
(2) $x \approx y \to y \approx x$,
(3) $x \approx y \approx z \to x \approx z$.


Proof. Exercise 1.59. 


I.6.46 Proposition. If $r \approx s$ and $r$ and $s$ are real, then $r = s$.


Proof. $r - s \approx 0$ and $r - s \in \mathbb{R}$. But, trivially, 0 is the only real infinitesimal.


I.6.47 Theorem (Main Theorem). For any finite non-standard number $h$ there is a unique real $r$ such that $h \approx r$.


Throughout the following proof we use superscriptless notation. In each case it is clear where we are: in $L$, in $\mathfrak{R}$, or in ${}^*\mathfrak{R}$.


Proof. Uniqueness is by I.6.45 and I.6.46. For the existence part let $|h| < b$, $0 < b \in \mathbb{R}$, and define

$H = \{x \in \mathbb{R} : x < h\}$   (1)

$H$, a subset of $\mathbb{R}$, is bounded above by $b$: Indeed, in ${}^*\mathfrak{R}$, and for any fixed $x \in H$, $x < h < b$ holds. Hence (cf. I.6.36), still in ${}^*\mathfrak{R}$, $x < b$ holds. Then it is true in $\mathbb{R}$ as well (cf. p. 99, (4) and (6)). Let $r \in \mathbb{R}$ be the least upper bound of $H$ (over the reals, least upper bounds of sets bounded above exist). We now argue that

$h \approx r$   (2)

Suppose that (2) is false. Then there is an $s > 0$, in $\mathbb{R}$, such that

$s \le |h - r|$ is true in ${}^*\mathfrak{R}$   (3)

There are two cases:

Case $|h - r| = h - r$. Then (3) implies $s + r \le h$. Since $s + r$ is standard, but $h$ is not, we have $s + r < h$; hence $s + r \in H$, and thus $s + r \le r$ from the choice of $r$ (upper bound), that is, $s \le 0$, contrary to the choice of $s$.

Case $|h - r| = r - h$. Then (3) implies $h \le r - s$. Thus $r - s$ is an upper bound of $H$ (in $\mathbb{R}$), contrary to choice of $r$ (least upper bound, yet $r - s < r$).


It is unreasonable to expect that an infinite number $h$ is infinitely close to a real $r$, for if that were possible, then $h - r = a \approx 0$. But $r + a$ is finite (right?).
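
To answer the “(right?)”: a real plus an infinitesimal is bounded by a real. A short sketch, using only the triangle inequality in ${}^*\mathfrak{R}$ and the fact that $|a| < 1$ for an infinitesimal $a$:

```latex
% r real, a \approx 0 (so |a| < 1, since 1 is a positive real):
|r + a| \le |r| + |a| < |r| + 1

% |r| + 1 is a positive real, so r + a is finite by Definition I.6.38.
% Hence h = r + a cannot be infinite.
```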


I.6.48 Definition (Standard Parts). Let $h$ be a finite hyperreal. The unique real $r$ such that $h \approx r$ is called the standard part of $h$. We write $\mathrm{st}(h) = r$.


I.6.49 Example. For any real $r$, $\mathrm{st}(r) = r$. This is by $r \approx r$ and uniqueness of standard parts. Also, since $h$ is an infinitesimal iff $h \approx 0$, then $h$ is an infinitesimal iff $\mathrm{st}(h) = 0$.


We can now prove that infinitesimals and infinite numbers exist in great 
abundance. 


I.6.50 Theorem. There is at least one, and hence there are uncountably many, non-standard infinitesimals.


Proof. Pick an $h \in {}^*\mathbb{R} - \mathbb{R}$ (cf. (3) preceding Definition I.6.33). If it is infinite, then $1/|h|$ is an infinitesimal (by I.6.40). It is also nonzero, and hence non-standard (0 is the only standard infinitesimal).

Pause. $\neg(\exists x)(1 = x0)$ is true in $\mathbb{R}$.†

If $|h|$ is finite, then $\mathrm{st}(h)$ exists and $a = h - \mathrm{st}(h) \approx 0$. If $a \in \mathbb{R}$, so is $h = \mathrm{st}(h) + a$. Since $h$ is non-standard, $a$ is therefore a non-standard infinitesimal.

Why uncountably many? Well, fix a non-standard infinitesimal $h$. The function $r \mapsto rh$ is a 1-1,‡ total function on the reals. Thus, its range, $\{rh : r \in \mathbb{R}\}$, is in 1-1 correspondence with $\mathbb{R}$, and hence is uncountable.


† Let us settle an inconsistency in terminology: We use “true in $\mathbb{R}$” (with or without the superscript) synonymously with “true in $\mathfrak{R}$”.

‡ 1-1 because $rh = r'h \to r = r'$, since $h \ne 0$ (being non-standard) and the sentence $(\forall x)(\forall y)(\forall z)(z \ne 0 \to xz = yz \to x = y)$ is true in ${}^*\mathfrak{R}$.

I.6.51 Corollary. There is at least one, and hence there are uncountably many, infinite numbers.


I.6.52 Remark. By the preceding results, every finite hyperreal $h$ has the form $\mathrm{st}(h) + a$, where $a \approx 0$. $a = 0$ iff $h \in \mathbb{R}$. For such hyperreals, $h \approx \mathrm{st}(h)$. Conversely, any pair $r$ (real) and $a \approx 0$ leads to a hyperreal $h = r + a$ such that $\mathrm{st}(h) = r$ (since $r + a \approx r$).

Thus, if we were to depict the set ${}^*\mathbb{R}$ on a line, in analogy with the $\mathbb{R}$-line, we could start with the latter, then stretch it and insert nonstandard numbers (respecting order, ${}^*{<}$) so that each real $r$ lies in a “cloud” of nonstandard numbers that are infinitely close to $r$. Then each such cloud† contains only one real; for $r \approx h \approx r'$ with $r, r'$ real implies $r \approx r'$ and hence $r = r'$. The cloud in which 0 lies is the set of all infinitesimals.

We then add all the positive infinite numbers to the right end of the line (again, respecting order, ${}^*{<}$) and all the negative infinite numbers to the left end of the line.


The definition of continuity of a function $f$ that we will eventually give essentially requires that the standard part function, st, commute with $f$. As a prelude towards that we present a proposition below that deals with the special cases of addition, multiplication, and a few other elementary functions.

But first, the non-standard counterpart to the pinching lemma of calculus.‡


I.6.53 Lemma (Pinching Lemma). If $0 < h < h'$ and $h' \approx 0$, then $h \approx 0$. Either or both of “$<$” can be replaced by “$\le$”.


Proof. Exercise 1.60. 


I.6.54 Corollary. If $a < b < c$ and $a \approx c$, then $a \approx b$ and $b \approx c$. Moreover, this remains true if we replace one or both of “$<$” by “$\le$”.


Proof. $0 < b - a < c - a$.
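
The one-line proof compresses three steps. Spelled out as a sketch, using only I.6.44, I.6.45, and I.6.53:

```latex
% From a < b < c:
0 < b - a < c - a

% a \approx c gives c \approx a by symmetry (I.6.45(2)), i.e., c - a \approx 0 (I.6.44).
% Pinching (I.6.53) then yields
b - a \approx 0, \quad\text{i.e.,}\quad b \approx a, \quad\text{hence}\quad a \approx b \ \text{(symmetry)}

% Finally, transitivity (I.6.45(3)) applied to b \approx a \approx c gives
b \approx c
```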


I.6.55 Proposition (Elementary Algebra of st). Throughout, $a$ and $b$ are fixed finite hyperreals.

(1) $a \ge 0$ implies $\mathrm{st}(a) \ge 0$.
(2) $\mathrm{st}(a + b) = \mathrm{st}(a) + \mathrm{st}(b)$.
(3) $\mathrm{st}(a - b) = \mathrm{st}(a) - \mathrm{st}(b)$.
(4) $\mathrm{st}(a \cdot b) = \mathrm{st}(a) \cdot \mathrm{st}(b)$.
(5) If $\mathrm{st}(b) \ne 0$, then $\mathrm{st}(a/b) = \mathrm{st}(a)/\mathrm{st}(b)$.
(6) $\mathrm{st}(a^n) = \mathrm{st}(a)^n$ for all $n \in \mathbb{N}$.
(7) If $\mathrm{st}(a) \ne 0$, then $\mathrm{st}(a^{-n}) = \mathrm{st}(a)^{-n}$ for all $0 < n \in \mathbb{N}$.
(8) $\mathrm{st}(a^{1/n}) = \mathrm{st}(a)^{1/n}$ for all $0 < n \in \mathbb{N}$ ($a \ge 0$ assumed for $n$ even).

† The cloud for $r \in \mathbb{R}$ is $\{r + a : a \approx 0\}$. Of course, if $a < 0$, then $r + a < r$, while if $a > 0$, then $r + a > r$.

‡ If $0 < g(x) < f(x)$ for all $x$ in an open interval $(a, b)$, $c \in (a, b)$, and $\lim_{x \to c} f(x) = 0$, then also $\lim_{x \to c} g(x) = 0$.


Proof. We sample a few cases and leave the rest as an exercise (Exercise I.61).

(1): Assume the hypothesis, yet $\mathrm{st}(a) < 0$. By I.6.54, $\mathrm{st}(a) \approx 0$; hence $\mathrm{st}(a) = 0$, a contradiction.

(5): $\mathrm{st}(b) \ne 0$ implies that $b \not\approx 0$;† in particular, $b \ne 0$. The formula to prove thus makes sense. Now, $a = b(a/b)$; hence, by (4), $\mathrm{st}(a) = \mathrm{st}(b(a/b)) = \mathrm{st}(b)\,\mathrm{st}(a/b)$.

(7): Having proved (6) by induction on $n$, we note that $\mathrm{st}(a^n) = \mathrm{st}(a)^n \ne 0$. Moreover, $a^{-n} = 1/a^n$. Hence $\mathrm{st}(a^{-n}) = \mathrm{st}(1/a^n) = \mathrm{st}(1)/\mathrm{st}(a^n) = 1/\mathrm{st}(a)^n$, since $\mathrm{st}(1) = 1$.

(8): For odd $n$, it is true in $\mathfrak{R}$ that

$(\forall x)(\exists y)\,x = y^n$

or, more colloquially,

$(\forall x)(\exists y)\,y = x^{1/n}$   (i)

(i) is also true in ${}^*\mathfrak{R}$, so it makes sense, for any $a \in {}^*\mathbb{R}$ and odd $n$, to form $a^{1/n}$. Similarly, if $n$ is even, then

$(\forall x)(x \ge 0 \to (\exists y)\,y = x^{1/n})$   (ii)

is true in ${}^*\mathfrak{R}$, so it makes sense, for any $0 \le a \in {}^*\mathbb{R}$ and even $n$, to form $a^{1/n}$.

For any such $a^{1/n}$ that makes sense, so does $\mathrm{st}(a)^{1/n}$, by (1).

Thus, noting that $a = (a^{1/n})^n$ we get $\mathrm{st}(a) = \mathrm{st}((a^{1/n})^n) = \mathrm{st}(a^{1/n})^n$, the second “=” by (6). Hence $\mathrm{st}(a^{1/n}) = \mathrm{st}(a)^{1/n}$.


A corollary to (1) above, insignificant (and easy) as it may sound, is the non-standard counterpart to the statement that a real closed interval $[a, b]$ is compact, that is, every sequence of members of $[a, b]$ that converges, converges to a number in $[a, b]$.


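I.6.56 Corollary. $h \le h'$ implies $\mathrm{st}(h) \le \mathrm{st}(h')$. In particular, if $a$ and $b$ are real and the closed interval in ${}^*\mathbb{R}$ is

$[a, b] \stackrel{\text{def}}{=} \{x \in {}^*\mathbb{R} : a \le x \le b\}$

then whenever $h \in [a, b]$, it also follows that $\mathrm{st}(h) \in [a, b]$.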
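
The corollary is stated without proof. One way to obtain it from I.6.55, offered as a sketch under the tacit assumption that $h$ and $h'$ are finite (so that their standard parts exist):

```latex
% h \le h' means h' - h \ge 0. By I.6.55(1) and then I.6.55(3):
0 \le \mathrm{st}(h' - h) = \mathrm{st}(h') - \mathrm{st}(h)
  \quad\Longrightarrow\quad \mathrm{st}(h) \le \mathrm{st}(h')

% Interval case: a \le h \le b with a, b real forces h to be finite, and
a = \mathrm{st}(a) \le \mathrm{st}(h) \le \mathrm{st}(b) = b
```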


† $b \not\approx 0$ means, of course, $\neg(b \approx 0)$.

The notion of the limit of a real function† of one variable $f$ at a point $a$ depends on what the function does when its inputs are in a neighbourhood of $a$ (that is, an open interval $(c, b)$ that contains $a$) but not on what it does on $a$ itself. For this reason the limit is defined in terms of a punctured neighbourhood of $a$, which means a set like $(c, a) \cup (a, b)$, where $c < a < b$. We are interested in calculating better and better approximations to the values $f(x)$ as $x$ gets very close to $a$ (but never becomes equal to $a$; for all we care, $f$ might not even be defined on $a$).

We can define limits à la Leibniz now, replacing “very close” by “infinitely close”:


I.6.57 Definition (Limits). Let ${}^\circ f$ be a real function of one variable defined in some punctured real neighbourhood of the real $a$. Let $b$ be also real.‡ The notation $\lim_{x \to a} {}^\circ f(x) = b$ abbreviates (i.e., is defined to mean)

for all non-standard $h \approx 0$, ${}^* f(a + h) \approx b$   (1)

In practice (argot) we will let the context fend for itself and simply write (1) as “for all non-standard $h \approx 0$, $f(a + h) \approx b$” and, similarly, $\lim_{x \to a} f(x) = b$ for its abbreviation, that is, dropping the $*$ and $\circ$ left superscripts. We have just defined the so-called two-sided finite limit.


Similarly one defines a whole variety of other limits. We give two examples 
of such definitions (in simplified notation): 


Suppose $f$ is defined in the open interval $(a, c)$ and let $b$ be real. Then the symbol $\lim_{x \to a^+} f(x) = b$ abbreviates

for all positive $h \approx 0$, $f(a + h) \approx b$   (2)

We have just defined the so-called right finite limit.

Finally, let $f$ be defined in the open interval $(a, c)$. Then $\lim_{x \to a^+} f(x) = +\infty$ abbreviates

for all positive $h \approx 0$, $f(a + h)$ is positive infinite   (3)

We have just defined the so-called right positive infinite limit.


1.6.58 Remark. 


(A) In (1) in the above definition, $h \in {}^*\mathbb{R} - \mathbb{R}$ guarantees that $h \ne 0$. This in turn guarantees that we are unconcerned with what $f$ wants to be on $a$ (as we should be) since $a + h \ne a$. Furthermore, ${}^* f(a + h)$ is defined for all such $h$, so that (1) makes sense. This is easy: To fix ideas, let ${}^\circ f$ be defined in the punctured neighbourhood $(d, a) \cup (a, c)$. First off, $0 < |h| < \min(a - d, c - a)$, since $0 \ne h \approx 0$. Hence $h > 0 \to a < a + h < c$, while $h < 0 \to d < a + h < a$. Secondly, the sentence $(\forall x)(d < x < a \vee a < x < c \to (\exists y)\,y = f(x))$ is true in $\mathfrak{R}$ by the assumption on ${}^\circ f$, hence also in ${}^*\mathfrak{R}$. Thus, ${}^* f$ is defined in the non-standard (punctured) neighbourhood $\{x \in {}^*\mathbb{R} : d < x < a \vee a < x < c\}$.

† That is, a function $f : \mathbb{R} \to \mathbb{R}$, not necessarily a total one.

‡ Recall that, because ${}^\circ a = {}^* a$ by (4) on p. 99, we have already decided to use the formal name “$a$” for either ${}^\circ a$ or ${}^* a$.


This neighbourhood we are going to denote by (d, a) U (a, c) as well, and 
let the context indicate whether or not this symbol denotes a standard or a 
non-standard neighbourhood. 


(B) Another way to state (1) above is

for all nonzero $h \approx 0$, $\mathrm{st}(f(a + h)) = b$   (4)

That is, the standard part above is independent of the choice of $h$.


I.6.59 Example. We compute $\lim_{x \to 2}(3x^3 - x^2 + 1)$: Let $0 \ne h \approx 0$ be arbitrary. Then

$\mathrm{st}(3(2 + h)^3 - (2 + h)^2 + 1) = \mathrm{st}(3(2^3 + 12h + 6h^2 + h^3) - (4 + 4h + h^2) + 1)$
$\qquad = \mathrm{st}(21 + 32h + 17h^2 + 3h^3)$
$\qquad = 21$, by I.6.55

We compute $\lim_{x \to 0}(x/|x|)$. We want $\mathrm{st}(h/|h|)$ for $0 \ne h \approx 0$. In order to remove the absolute value sign we consider cases:

$\dfrac{h}{|h|} = \begin{cases} 1 & \text{if } h > 0 \\ -1 & \text{if } h < 0 \end{cases}$

According to the definition (cf. (B), previous remark), the limit $\lim_{x \to 0}(x/|x|)$ does not exist. The calculation above shows however that $\lim_{x \to 0^+}(x/|x|) = 1$ and $\lim_{x \to 0^-}(x/|x|) = -1$.

We see that calculating limits within non-standard calculus is easy be- 
cause we calculate with equalities, rather than with inequalities as in standard 
calculus. 
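
As one more illustration of calculating with equalities, here is a sketch of a limit at a point where the function is not defined. The example function $(x^2 - 1)/(x - 1)$ is our choice and does not come from the text; the computation uses only I.6.55(2) and the fact that $\mathrm{st}(h) = 0$ for $h \approx 0$.

```latex
% Illustrative example (not from the text): \lim_{x \to 1} \frac{x^2 - 1}{x - 1}.
% Let 0 \neq h \approx 0 be arbitrary. Since h \neq 0 we may divide by h:
\frac{(1 + h)^2 - 1}{(1 + h) - 1} = \frac{2h + h^2}{h} = 2 + h

% Taking standard parts:
\mathrm{st}(2 + h) = \mathrm{st}(2) + \mathrm{st}(h) = 2 + 0 = 2

% The value is independent of the choice of h, so the limit exists and equals 2.
```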


We show next that the non-standard definition of limit is equivalent to Weierstrass’s $\varepsilon$-$\delta$ definition:

I.6.60 Theorem. Let ${}^\circ f$ be a real function, defined in some punctured neighbourhood of (the real) $a$. Let $b$ be also real. The following statements are equivalent:


(i) (1) in Definition I.6.57
(ii) $(\forall\, 0 < \varepsilon \in \mathbb{R})(\exists\, 0 < \delta \in \mathbb{R})(\forall x \in \mathbb{R})(0 < |x - a| < \delta \to |{}^\circ f(x) - b| < \varepsilon)$.†


Proof. (i) → (ii): Let then (1) in I.6.57 hold. To prove (ii), fix an $0 < \varepsilon \in \mathbb{R}$.‡ It now suffices to show the truth of the formal sentence (iii) below:

$(\exists \delta > 0)(\forall x)(0 < |x - a| < \delta \to |f(x) - b| < \varepsilon)$   (iii)

One way to prove an existential sentence such as (iii) is to exhibit a $\delta$ that works. Since it is a sentence over $L$, it suffices to verify it in ${}^*\mathfrak{R}$. It will then be true in $\mathfrak{R}$, which is what we want.

Thus, we take $\delta = h$ where $0 < h \approx 0$ and show that it works. Let now $x \in {}^*\mathbb{R}$ be arbitrary such that $0 < |x - a| < h$. Thus (I.6.53), $|x - a| \approx 0$; hence $x - a \approx 0$ from $-|x - a| \le x - a \le |x - a|$ and I.6.54 (via I.6.55). We can now write $x = a + h'$, with $h' \approx 0$ and $h' \ne 0$. By hypothesis, i.e., (i), ${}^* f(a + h') \approx b$ is true in ${}^*\mathfrak{R}$, i.e., ${}^* f(x) \approx b$ is true. Hence ${}^* f(x) - b \approx 0$, and therefore $|{}^* f(x) - b|$ is less than any positive real. In particular, it is less than $\varepsilon$.


(ii) → (i): We assume (ii), pick an arbitrary $h$ such that $0 \ne h \approx 0$, and prove ${}^* f(a + h) \approx b$. This requires

for all real $\varepsilon > 0$, $|{}^* f(a + h) - b| < \varepsilon$   (iv)

So fix an arbitrary real $\varepsilon > 0$. Assumption (ii) translates into the assumption that the sentence (iii) is true in $\mathfrak{R}$. Let then $\delta > 0$ be real, so that the $L$-sentence below is true in $\mathfrak{R}$ ($\delta$ and $\varepsilon$ below are formal constants):

$(\forall x)(0 < |x - a| < \delta \to |f(x) - b| < \varepsilon)$   (v)

(v) is also true in ${}^*\mathfrak{R}$. By specialization in the metalanguage, take $x = a + h$. Now, $0 < |x - a| = |h|$ by choice of $h$. Also, $|x - a| = |h| < \delta$ is true, since $\delta > 0$ is real and $h \approx 0$. Thus, by (v), translated into ${}^*\mathfrak{R}$, we have $|{}^* f(x) - b| < \varepsilon$. This proves (iv).


† This argot is a bit awkward, but not unusual. “$(\forall\, 0 < \varepsilon \in \mathbb{R}) \ldots$” stands for “$(\forall \varepsilon)(0 < \varepsilon \wedge \varepsilon \in \mathbb{R} \to \ldots$”.

‡ We have fixed a real $\varepsilon$. Recall that the name “$\varepsilon$” is also used for the formal constant that denotes this real $\varepsilon$.

Worth repeating: The part (i) → (ii) of the above proof was an instance where we were able to prove a first order fact in ${}^*\mathfrak{R}$ and then transfer it back to $\mathfrak{R}$. This is the essential use we get from the elementary extension $\mathfrak{R} \prec {}^*\mathfrak{R}$, for if all the facts we needed could easily be proved in $\mathfrak{R}$, the whole fuss of obtaining an extension that contains weird numbers would be pointless.


We conclude with the definition of continuity and with one more elementary application of transferring facts from ${}^*\mathfrak{R}$ back to $\mathfrak{R}$. Some more techniques and facts will be discovered by the reader in the Exercises section.


I.6.61 Definition. Let $f$ be a real function of one real variable, defined at least on an open real interval $(a, b)$. We say that $f$ is continuous at $c \in (a, b)$ (a real point) iff $\lim_{x \to c} f(x) = f(c)$.

If $f$ is also defined at $a$, then we say that it is continuous at the left endpoint $a$ of $[a, b)$, meaning that $\lim_{x \to a^+} f(x) = f(a)$. In a similar situation at the right endpoint $b$, we require that $\lim_{x \to b^-} f(x) = f(b)$.

We say that $f$ is continuous on $[a, b]$ iff it is so at every real $x \in [a, b]$.


I.6.62 Remark. The above is the standard definition. Since it involves the concept of limit, we may translate it to a corresponding non-standard definition. Let then $f$ be defined on the real closed interval $[a, b]$. Then for any (real) $c \in (a, b)$ continuity requires (using $h$ as a free variable over ${}^*\mathbb{R}$)

$0 \ne h \approx 0 \to \mathrm{st}({}^* f(c + h)) = f(c)$   (1)

Continuity at the endpoints reads

$0 < h \approx 0 \to \mathrm{st}({}^* f(a + h)) = f(a)$   (2)

and

$0 > h \approx 0 \to \mathrm{st}({}^* f(b + h)) = f(b)$   (3)

Does it matter if we take the “$0 \ne$” part away? No, since the limit is equal to the function value.

Suppose now that $f$ is continuous on the real interval $[a, b]$. We now extend $[a, b]$ to include non-standard numbers as in I.6.56. Then, whenever $x \in [a, b]$, where $x$ is a hyperreal, we also have $\mathrm{st}(x) \in [a, b]$ by I.6.56. Thus, $x \in [a, b]$ implies that $x = r + h$ where $r$ is real ($a \le r \le b$) and $h \approx 0$. We can now capture (1)-(3) by the single statement

$x \in [a, b] \to \mathrm{st}({}^* f(x)) = {}^* f(\mathrm{st}(x))$   (4)

Thus, continuity is the state of affairs where st commutes with the function letter. By the way, since $\mathrm{st}(x)$ is real, so is ${}^* f(\mathrm{st}(x))$; indeed, it is the same as ${}^\circ f(\mathrm{st}(x))$ (cf. (5), p. 99), which we write, more simply, $f(\mathrm{st}(x))$. In practice one writes (4) above as

$x \in [a, b] \to \mathrm{st}(f(x)) = f(\mathrm{st}(x))$


I.6.63 Example. The function $x \mapsto \sqrt{x}$ is continuous on any $[a, b]$ where $0 \le a$. Indeed, by I.6.55, $0 \le x$ implies $\mathrm{st}(\sqrt{x}) = \sqrt{\mathrm{st}(x)}$. Now invoke (4) in I.6.62.


I.6.64 Theorem. Suppose that $f$ is continuous on the real interval $[a, b]$. Then $f$ is bounded on $[a, b]$, that is, there is a real $B > 0$ such that

$x \in [a, b] \cap \mathbb{R} \to |f(x)| < B$   (1)


Proof. We translate the theorem conclusion into a sentence over $L$. “$f$”, as usual, plays a dual role: name of the real and name of the formal object. The translation is

$(\exists y)(\forall x)(a \le x \le b \to |f(x)| < y)$   (1′)

Now (1′) is true in ${}^*\mathfrak{R}$ under the assumption that ${}^\circ f$, which we still call $f$, is continuous.

Here is why: Take $y = H$, where $H \in {}^*\mathbb{R}$ is some positive infinite hyperreal. Pick any hyperreal $x$ in $[a, b]$ (extended interval). Now, the assumption on continuity, in the form (4) of I.6.62, has the side effect that

$\mathrm{st}(f(x))$ is defined

Hence $f(x)$ is finite. Let then $0 < r_x \in \mathbb{R}$ be such that $|f(x)| < r_x$. This $r_x$ depends on the picked $x$. But $r_x < H$; thus $|f(x)| < H$ for the arbitrary hyperreal $x$ in $[a, b]$, establishing the truth of (1′) in ${}^*\mathfrak{R}$. So it is true in $\mathfrak{R}$ too.


I.7. Defined Symbols


We have already mentioned that the language lives, and it is being constantly 
enriched by new nonlogical symbols through definitions. The reason we do this 
is to abbreviate undecipherably long formal texts, thus making them humanly 
understandable. 

There are three possible kinds of formal abbreviations, namely, abbreviations of formulas, abbreviations of variable terms (i.e., “objects” that depend on free variables), and abbreviations of constant terms (i.e., “objects” that do not depend on free variables). Correspondingly, we introduce a new nonlogical symbol for a predicate, a function, or a constant in order to accomplish such abbreviations. Here are three simple examples, representative of each case.

We introduce a new predicate (symbol), “$\subseteq$”, in set theory by a definition†

$A \subseteq B \leftrightarrow (\forall x)(x \in A \to x \in B)$


An introduction of a function symbol by definition is familiar from elemen- 
tary mathematics. There is a theorem that says 


“for every non-negative real number x there is a unique 
non-negative real number y such that x = y- y” (1) 


This justifies the introduction of a 1-ary function symbol $f$ that, for each such $x$, produces the corresponding $y$. Instead of using the generic “$f(x)$”, we normally adopt one of the notations “$\sqrt{x}$” or “$x^{1/2}$”. Thus, we enrich the language (of, say, algebra or real analysis) by the function symbol “$\sqrt{\phantom{x}}$” and add as an axiom the definition of its behaviour. This would be

$x = \sqrt{x}\,\sqrt{x}$

or

$y = \sqrt{x} \leftrightarrow x = y \cdot y$

where the restriction $x \ge 0$ is implied by the context.

The “enabling formula” (1), stated in argot above, is crucial in order that we be allowed to introduce $\sqrt{\phantom{x}}$ and its defining axiom. That is, before we introduce an abbreviation of a (variable or constant) term, i.e., an object, we must have a proof in our theory of an existential formula, i.e., one of the type $(\exists! y)\mathscr{A}$, that asserts that (if applicable, for each value of the free variables) a unique such object exists.


The symbol “$(\exists! y)$” is read “there is a unique $y$”. It is a “logical” abbreviation (defined logical symbol, just like $\forall$) given (in least-parenthesized form, with $x$ as the quantified variable) by

$(\exists x)\bigl(\mathscr{A} \wedge \neg(\exists z)(\mathscr{A}[z] \wedge \neg\, x = z)\bigr)$
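
To see the abbreviation at work, the following sketch unfolds the enabling formula (1) for the square root. The formal rendering of (1), including the explicit $0 \le y$ conjunct and the choice of $z$ as the second bound variable, is our assumption; the text states (1) only in argot.

```latex
% Assumed formal rendering of the enabling formula (1):
(\forall x)\bigl(0 \le x \;\to\; (\exists! y)(0 \le y \wedge x = y \cdot y)\bigr)

% Unfolding (\exists! y) according to the abbreviation above:
(\forall x)\Bigl(0 \le x \;\to\;
  (\exists y)\bigl(0 \le y \wedge x = y \cdot y \;\wedge\;
    \neg(\exists z)(0 \le z \wedge x = z \cdot z \wedge \neg\, y = z)\bigr)\Bigr)
```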


Finally, an example of introducing a new constant symbol, from set theory, is the introduction of the symbol $\emptyset$ into the language, as the name of the unique object‡ $y$ that satisfies $\neg U(y) \wedge (\forall x)\,x \notin y$, read “$y$ is a set, and it has no members”.§ Thus, $\emptyset$ is defined by

$\neg U(\emptyset) \wedge (\forall x)\,x \notin \emptyset$

or, equivalently, by

$y = \emptyset \leftrightarrow \neg U(y) \wedge (\forall x)\,x \notin y$

† In practice we state the definition of $\subseteq$ in argot, probably as “$A \subseteq B$ means that, for all $x$, $x \in A \to x \in B$”.


The general situation is this: We start with a theory $\Gamma$, spoken in some basic¶ formal language $L$. As the development of $\Gamma$ proceeds, gradually and continuously we extend $L$ into languages $L_n$, for $n \ge 0$ (we have set $L_0 = L$). Thus the symbol $L_{n+1}$ stands for some arbitrary extension of $L_n$ effected at stage $n + 1$. The theory itself is being extended by stages, as a sequence $\Gamma_n$, $n \ge 0$.

A stage is marked by the event of introducing a single new symbol into the language via a definition of a new predicate, function or constant symbol. At that same stage we also add to $\Gamma_n$ the defining nonlogical axiom of the new symbol in question, thus extending the theory $\Gamma_n$ into $\Gamma_{n+1}$. We set $\Gamma_0 = \Gamma$.


Specifically, if‖ $\mathscr{Q}(\vec{x}_n)$ is some formula, we then can introduce a new predicate symbol “$P$”†† that stands for $\mathscr{Q}$.


In the present description, $\mathscr{Q}$ is a syntactic (meta-)variable, while $P$ is a new formal predicate symbol.


This entails adding $P$ to $L_k$ (i.e., to its alphabet $\mathcal{V}_k$) as a new n-ary predicate symbol, and adding

$P\vec{x}_n \leftrightarrow \mathscr{Q}(\vec{x}_n)$   (i)

to $\Gamma_k$ as the defining axiom for $P$. “$\subseteq$” is such a defined (2-ary) predicate in set theory.

Similarly, a new n-ary function symbol $f$ is added into $L_k$ (to form $L_{k+1}$) by a definition of its behaviour. That is, we add $f$ to $L_k$ and also add the following formula (ii) to $\Gamma_k$ as a new nonlogical axiom:

$y = f y_1 \ldots y_n \leftrightarrow \mathscr{Q}(y, y_1, \ldots, y_n)$   (ii)

provided we have a proof in $\Gamma_k$ of the formula

$(\exists! y)\mathscr{Q}(y, y_1, \ldots, y_n)$   (iii)

Of course, the variables $y, \vec{y}_n$ are distinct.

‡ Uniqueness follows from extensionality, while existence follows from separation. These facts, and the italicized terminology, are found in volume 2, Chapter III.

§ $U$ is a 1-ary (unary) predicate. It is one of the two primitive nonlogical symbols of formal set theory. With the help of this predicate we can “test” an object for set or atom status. “$U(y)$” asserts that $y$ is an atom, thus “$\neg U(y)$” asserts that $y$ is a set, since we accept that sets or atoms are the only types of objects that the formal system axiomatically characterizes.

¶ “Basic” means here the language given originally, before any new symbols were added.

‖ Recall that (see Remark I.1.11, p. 18) the notation $\mathscr{Q}(\vec{x}_n)$ asserts that $\vec{x}_n$, i.e., $x_1, \ldots, x_n$, is the complete list of the free variables of $\mathscr{Q}$.

†† Recall that predicate letters are denoted by non-calligraphic capital letters $P, Q, R$ with or without subscripts or primes.


Depending on the theory, and the number of free variables ($n \ge 0$), “$f$” may take theory-specific names such as $\emptyset$, $\omega$, $\sqrt{\phantom{x}}$, etc. (in this illustration, for the sake of economy of effort, we have thought of defined constants, e.g., $\emptyset$ and $\omega$, as 0-ary functions, something that we do not normally do).


In effecting these definitions, we want to be assured of two things: 


1. Whatever we can state in the richer language $L_k$ (for any $k > 0$) we can also state in the original (“basic”) language $L = L_0$ (although awkwardly, which justifies our doing all this). “Can state” means that we can “translate” any formula $\mathscr{F}$ over $L_k$ (hopefully in a natural way) into a formula $\mathscr{F}^*$ over $L$ so that the extended theory $\Gamma_k$ can prove that $\mathscr{F}$ and $\mathscr{F}^*$ are equivalent.†

2. We also want to be assured that the new symbols offer no more than convenience, in the sense that any formula $\mathscr{F}$, over the basic language $L$, that $\Gamma_k$ ($k > 0$) is able to prove, one way or another (perhaps with the help of defined symbols), $\Gamma$ can also prove.‡


These assurances will become available shortly, as Metatheorems I.7.1 and I.7.3. Here are the “natural” translation rules, that take us from a language stage $L_{k+1}$ back to the previous, $L_k$ (so that, iterating the process, we are back to $L$):


Rule (1). Suppose that $\mathscr{F}$ is a formula over $L_{k+1}$, and that the predicate $P$ (whose definition took us from $L_k$ to $L_{k+1}$, and hence is a symbol of $L_{k+1}$ but not of $L_k$) occurs in $\mathscr{F}$ zero or more times. Assume that $P$ has been defined by the axiom (i) above (included in $\Gamma_{k+1}$), where $\mathscr{Q}$ is a formula over $L_k$.

We eliminate $P$ from $\mathscr{F}$ by replacing all its occurrences by $\mathscr{Q}$. By this we mean that whenever $P\vec{t}_n$ is a subformula of $\mathscr{F}$, all its occurrences are replaced by $\mathscr{Q}(\vec{t}_n)$. We can always arrange by I.4.13 that the simultaneous substitution $\mathscr{Q}[\vec{x}_n \leftarrow \vec{t}_n]$ is defined.


This results in a formula $\mathscr{F}^*$ over $L_k$.
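
A small instance of Rule (1), as a sketch: we eliminate the defined predicate $\subseteq$ from a formula of set theory. The sample formula (transitivity of inclusion) is our choice for illustration; here $\mathscr{Q}(A, B)$ is $(\forall x)(x \in A \to x \in B)$, as in the definition of $\subseteq$ given earlier.

```latex
% \mathscr{F}, over the language that includes \subseteq:
A \subseteq B \wedge B \subseteq C \;\to\; A \subseteq C

% \mathscr{F}^*, obtained by replacing each occurrence P\vec{t}_n by \mathscr{Q}(\vec{t}_n):
(\forall x)(x \in A \to x \in B) \wedge (\forall x)(x \in B \to x \in C)
  \;\to\; (\forall x)(x \in A \to x \in C)
```

No renaming of bound variables is needed in this instance, because the substituted terms are variables distinct from the bound $x$.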


† $\Gamma$, spoken over $L$, can have no opinion, of course, since it cannot see the new symbols, nor does it have their “definitions” among its “knowledge”.

‡ Trivially, any $\mathscr{F}$ over $L$ that $\Gamma$ can prove, any $\Gamma_k$ ($k > 0$) can prove as well, since the latter understands the language ($L$) and contains all the axioms of $\Gamma$. Thus $\Gamma_k$ extends the theory $\Gamma$. That it cannot have more theorems over $L$ than $\Gamma$ makes this extension conservative.

Rule (2). If $f$ is a defined n-ary function symbol as in (ii) above, introduced into $L_{k+1}$, and if it occurs in $\mathscr{F}$ as $\mathscr{F}[f t_1 \ldots t_n]$,† then this formula is logically equivalent to‡

$(\exists y)(y = f t_1 \ldots t_n \wedge \mathscr{F}[y])$   (iv)

provided that $y$ is not free in $\mathscr{F}[f t_1 \ldots t_n]$.

Using the definition of $f$ given by (ii) and I.4.13 to ensure that $\mathscr{Q}(y, \vec{t}_n)$ is defined, we eliminate this occurrence of $f$, writing (iv) as

$(\exists y)(\mathscr{Q}(y, t_1, \ldots, t_n) \wedge \mathscr{F}[y])$   (v)

which says the same thing as (iv) in any theory that thinks that (ii) is true (this observation is made precise in the proof of Metatheorem I.7.1). Of course, $f$ may occur many times in $\mathscr{F}$, even “within itself”, as in $f f z_1 \ldots z_n y_2 \ldots y_n$,§ or even in more complicated configurations. Indeed, it may occur within the scope of a quantifier. So the rule becomes: Apply the transformation taking every atomic subformula $\mathscr{A}[f\vec{t}_n]$ of $\mathscr{F}$ into form (v) by stages, eliminating at each stage the leftmost innermost¶ occurrence of $f$ (in the atomic formula we are transforming at this stage), until all occurrences of $f$ are eliminated.

We now have a formula $\mathscr{F}^*$ over $L_k$.
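
The staged elimination in Rule (2) can be sketched on a nested occurrence of the square root symbol. The atomic formula below is our illustrative choice; the defining formula $\mathscr{Q}(y, x)$ is taken to be $x = y \cdot y$, with the nonnegativity restrictions left to the context as in the text, and $y$, $w$ are fresh variables.

```latex
% Atomic formula with f = \sqrt{\phantom{x}} occurring "within itself":
\sqrt{\sqrt{x}} = z

% Stage 1: eliminate the innermost occurrence \sqrt{x}, using form (v):
(\exists y)\bigl(x = y \cdot y \;\wedge\; \sqrt{y} = z\bigr)

% Stage 2: eliminate the remaining occurrence in the atomic formula \sqrt{y} = z:
(\exists y)\bigl(x = y \cdot y \;\wedge\; (\exists w)(y = w \cdot w \wedge w = z)\bigr)
```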


I.7.1 Metatheorem (Elimination of Defined Symbols: I). Let $\Gamma$ be any theory over some formal language $L$.


(a) Let the formula $\mathscr{Q}$ be over $L$, and $P$ be a new predicate symbol that extends $L$ into $L'$ and $\Gamma$ into $\Gamma'$ via the axiom $P\vec{x}_n \leftrightarrow \mathscr{Q}(\vec{x}_n)$. Then, for any formula $\mathscr{F}$ over $L'$, the $P$-elimination as in Rule (1) above yields an $\mathscr{F}^*$ over $L$ such that

$\Gamma' \vdash \mathscr{F} \leftrightarrow \mathscr{F}^*$

(b) Let $\mathscr{F}[x]$ be over $L$, and let $t$ stand for $f t_1, \ldots, t_n$, where $f$ is introduced by (ii) above as an axiom that extends $\Gamma$ into $\Gamma'$. Assume that no $t_i$ contains the letter $f$ and that $y$ is not free in $\mathscr{F}[t]$. Then‖

$\Gamma' \vdash \mathscr{F}[t] \leftrightarrow (\exists y)(\mathscr{Q}(y, \vec{t}_n) \wedge \mathscr{F}[y])$

Here “$L'$” is “$L_{k+1}$” (for some $k$) and “$L$” is “$L_k$”.

† This notation allows for the possibility that $f t_1, \ldots, t_n$ does not occur at all in $\mathscr{F}$ (see the convention on brackets, p. 18).

‡ See (C) in the proof of Metatheorem I.7.1 below.

§ Or $f(f(z_1, \ldots, z_n), y_2, \ldots, y_n)$, using brackets and commas to facilitate reading.

¶ A term $f t_1, \ldots, t_n$ is “innermost” iff none of the $t_i$ contains “$f$”.

‖ As we already have remarked, in view of I.4.13, it is unnecessary pedantry to make assumptions on substitutability explicit.
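Proof. First observe that this metatheorem indeed gives the assurance that, after applying the transformations (1) and (2) to obtain $\mathscr{F}^*$ from $\mathscr{F}$, $\Gamma'$ thinks that the two are equivalent.


(a): This follows immediately from the Leibniz rule (I.4.25).

(b): Start with

$\vdash \mathscr{F}[t] \to t = t \wedge \mathscr{F}[t]$   (by $\vdash t = t$ and $\models_{\mathrm{Taut}}$-implication)   (A)

Now, by Ax2, substitutability, and non-freedom of $y$ in $\mathscr{F}[t]$,

$\vdash t = t \wedge \mathscr{F}[t] \to (\exists y)(y = t \wedge \mathscr{F}[y])$

Hence

$\vdash \mathscr{F}[t] \to (\exists y)(y = t \wedge \mathscr{F}[y])$   (B)

by (A) and $\models_{\mathrm{Taut}}$-implication.†

Conversely,

$\vdash y = t \to (\mathscr{F}[y] \leftrightarrow \mathscr{F}[t])$   (Ax4; substitutability was used here)

Hence (by $\models_{\mathrm{Taut}}$)

$\vdash y = t \wedge \mathscr{F}[y] \to \mathscr{F}[t]$

Therefore, by $\exists$-introduction (allowed, by our assumption on $y$),

$\vdash (\exists y)(y = t \wedge \mathscr{F}[y]) \to \mathscr{F}[t]$

which, along with (B), establishes

$\vdash \mathscr{F}[t] \leftrightarrow (\exists y)(y = t \wedge \mathscr{F}[y])$   (C)

Finally, by (ii) (which introduces $\Gamma'$ to the left of $\vdash$), (C), and the Leibniz rule,

$\Gamma' \vdash \mathscr{F}[t] \leftrightarrow (\exists y)(\mathscr{Q}(y, \vec{t}_n) \wedge \mathscr{F}[y])$   (D)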


The import of Metatheorem I.7.1 is that if we transform a formula $\mathscr{F}$, written over some arbitrary extension by definitions, $L_{k+1}$, of the basic language $L$, into a formula $\mathscr{F}^*$ over $L$, then $\Gamma_{k+1}$ (the theory over $L_{k+1}$ that has the benefit of all the added axioms) thinks that $\mathscr{F} \leftrightarrow \mathscr{F}^*$. The reason for this is that we can imagine that we eliminate one new symbol at a time, repeatedly applying the metatheorem above (part (b) to atomic subformulas), forming a sequence of increasingly more “basic” formulas $\mathscr{F}_{k+1}, \mathscr{F}_k, \mathscr{F}_{k-1}, \ldots, \mathscr{F}_0$, where $\mathscr{F}_0$ is the same string as $\mathscr{F}^*$ and $\mathscr{F}_{k+1}$ is the same string as $\mathscr{F}$.

Now, $\Gamma_{i+1} \vdash \mathscr{F}_{i+1} \leftrightarrow \mathscr{F}_i$ for $i = k, \ldots, 0$, where, if a defined function letter was eliminated at step $i + 1 \to i$, we invoke (D) above and the Leibniz rule. Hence, since $\Gamma_0 \subseteq \Gamma_1 \subseteq \cdots \subseteq \Gamma_{k+1}$, $\Gamma_{k+1} \vdash \mathscr{F}_{i+1} \leftrightarrow \mathscr{F}_i$ for $i = k, \ldots, 0$, therefore $\Gamma_{k+1} \vdash \mathscr{F}_{k+1} \leftrightarrow \mathscr{F}_0$.

† We will often write just “by $\models_{\mathrm{Taut}}$” meaning to say “by $\models_{\mathrm{Taut}}$-implication”.


I.7.2 Remark (One Point Rule). The absolutely provable formula in (C) above
is sometimes called the one point rule (Gries and Schneider (1994), Tourlakis
(2000a, 2001b)). Its dual

    $\vdash \mathcal{F}[t] \leftrightarrow (\forall y)(y = t \to \mathcal{F}[y])$

is also given the same nickname and is easily (absolutely) provable using (C)
by eliminating $\exists$.


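I.7.3 Metatheorem (Elimination of Defined Symbols: II). Let $\Gamma$ be a theory
over a language $L$.

(a) If $L'$ denotes the extension of $L$ by the new predicate symbol $P$, and $\Gamma'$
denotes the extension of $\Gamma$ by the addition of the axiom $P\vec{x}_n \leftrightarrow \mathcal{Q}(\vec{x}_n)$,
where $\mathcal{Q}$ is a formula over $L$, then $\Gamma \vdash \mathcal{F}$ for any formula $\mathcal{F}$ over $L$ such
that $\Gamma' \vdash \mathcal{F}$.

(b) Assume that

    $\Gamma \vdash (\exists! y)\mathcal{Q}(y, x_1, \ldots, x_n)$    (∗)

pursuant to which we have defined the new function symbol $f$ by the axiom

    $y = f x_1 \ldots x_n \leftrightarrow \mathcal{Q}(y, x_1, \ldots, x_n)$    (∗∗)

and thus extended $L$ to $L'$ and $\Gamma$ to $\Gamma'$. Then $\Gamma \vdash \mathcal{F}$ for any formula $\mathcal{F}$
over $L$ such that $\Gamma' \vdash \mathcal{F}$.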


Proof. This metatheorem assures that extensions of theories by definitions are 
conservative in that they produce convenience but no additional power (the 
same old theorems over the original language are the only ones provable). 


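(a): By the completeness theorem, we show instead that

    $\Gamma \models \mathcal{F}$    (1)

So let $\mathfrak{M} = (M, \mathcal{I})$ be an arbitrary model of $\Gamma$, i.e., let

    $\models_{\mathfrak{M}} \Gamma$    (2)

We now expand the structure $\mathfrak{M}$ into $\mathfrak{M}' = (M, \mathcal{I}')$, without adding any
new individuals to its domain $M$, by adding an interpretation $P^{\mathcal{I}'}$ for the new
symbol $P$. We define for every $a_1, \ldots, a_n$ in $M$

    $P^{\mathcal{I}'}(a_1, \ldots, a_n) = \mathbf{t}$ iff $\models_{\mathfrak{M}} \mathcal{Q}(\bar{a}_1, \ldots, \bar{a}_n)$    [i.e., iff $\models_{\mathfrak{M}'} \mathcal{Q}(\bar{a}_1, \ldots, \bar{a}_n)$]

Clearly then, $\mathfrak{M}'$ is a model of the new axiom, since, for all $\mathfrak{M}'$-instances of
the axiom, such as $P(\bar{a}_1, \ldots, \bar{a}_n) \leftrightarrow \mathcal{Q}(\bar{a}_1, \ldots, \bar{a}_n)$, we have

    $(P(\bar{a}_1, \ldots, \bar{a}_n) \leftrightarrow \mathcal{Q}(\bar{a}_1, \ldots, \bar{a}_n))^{\mathcal{I}'} = \mathbf{t}$

It follows that $\models_{\mathfrak{M}'} \Gamma'$, since we have $\models_{\mathfrak{M}'} \Gamma$, the latter by (2), due to having
made no changes to $\mathfrak{M}$ that affect the symbols of $L$. Thus, $\Gamma' \vdash \mathcal{F}$ yields
$\models_{\mathfrak{M}'} \mathcal{F}$; hence, since $\mathcal{F}$ is over $L$, we obtain $\models_{\mathfrak{M}} \mathcal{F}$. Along with (2), this
proves (1).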

(b): As in (a), assume (2) in an attempt to prove (1). By (∗),

    $\models_{\mathfrak{M}} (\exists! y)\mathcal{Q}(y, x_1, \ldots, x_n)$

Thus, there is a concrete (i.e., in the metatheory) function $\hat{f}$ of $n$ arguments that
takes its inputs from $M$ and gives its outputs to $M$, the input-output relation
being given by (3) below ($\vec{b}_n$ in, $a$ out). To be specific, the semantics of “$\exists!$”
implies that for all $b_1, \ldots, b_n$ in $M$ there is a unique $a \in M$ such that

    $(\mathcal{Q}(\bar{a}, \bar{b}_1, \ldots, \bar{b}_n))^{\mathcal{I}} = \mathbf{t}$    (3)

We now expand the structure $\mathfrak{M}$ into $\mathfrak{M}' = (M, \mathcal{I}')$,† so that all we add to it
is an interpretation for the new function symbol $f$. We let $f^{\mathcal{I}'} = \hat{f}$. From (2)
it follows that

    $\models_{\mathfrak{M}'} \Gamma$    (2′)

since we made no changes to $\mathfrak{M}$ other than adding an interpretation of $f$, and
since no formula in $\Gamma$ contains $f$. By (3), if $a, b_1, \ldots, b_n$ are any members of
$M$, then we have

    $\models_{\mathfrak{M}'} \bar{a} = f \bar{b}_1 \ldots \bar{b}_n$ iff $a = \hat{f}(b_1, \ldots, b_n)$
        iff $\models_{\mathfrak{M}} \mathcal{Q}(\bar{a}, \bar{b}_1, \ldots, \bar{b}_n)$    (by the definition of $\hat{f}$)
        iff $\models_{\mathfrak{M}'} \mathcal{Q}(\bar{a}, \bar{b}_1, \ldots, \bar{b}_n)$

the last “iff” being because $\mathcal{Q}$ (over $L$) means the same thing in $\mathfrak{M}$ and $\mathfrak{M}'$.
Thus,

    $\models_{\mathfrak{M}'} y = f x_1 \ldots x_n \leftrightarrow \mathcal{Q}(y, x_1, \ldots, x_n)$    (4)

Now (∗∗), (2′), and (4) yield $\models_{\mathfrak{M}'} \Gamma'$, which implies $\models_{\mathfrak{M}'} \mathcal{F}$ (from $\Gamma' \vdash \mathcal{F}$).
Finally, since $\mathcal{F}$ contains no $f$, we have $\models_{\mathfrak{M}} \mathcal{F}$. This last fact, and (2), give (1).

† This part is independent of part (a); hence this is a different $\mathcal{I}'$ in general.


I.7.4 Remark.

(a) We note that translation rules (1) and (2) (the latter applied to atomic
subformulas) preserve the syntactic structure of quantifier prefixes. For
example, suppose that we have introduced $f$ by

    $y = f x_1 \ldots x_n \leftrightarrow \mathcal{Q}(y, x_1, \ldots, x_n)$    (5)

in set theory. Now, an application of the collection axiom of set theory has
a hypothesis of the form

    “$(\forall x \in Z)(\exists w)(\ldots \mathcal{A}[f t_1 \ldots t_n] \ldots)$”    (6)

where, say, $\mathcal{A}$ is atomic and the displayed $f$ is innermost. Eliminating this
$f$, we have the translation

    “$(\forall x \in Z)(\exists w)(\ldots (\exists y)(\mathcal{A}[y] \land \mathcal{Q}(y, t_1, \ldots, t_n)) \ldots)$”    (7)

which still has the $\forall\exists$-prefix and still looks exactly like a collection axiom
hypothesis.

(b) Rather than worrying about the “ontology” of the function symbol formally
introduced by (5) above (i.e., the question of the exact nature of the symbol
that we named “$f$”), in practice we shrug this off and resort to metalinguistic
devices to name the function symbol, or the term that naturally arises from
it. For example, one can use the notation “$f_{\mathcal{Q}}$” for the function, where the
subscript “$\mathcal{Q}$” is the exact string over the language that “$\mathcal{Q}$” denotes, or,
for the corresponding term, the notation of Whitehead and Russell (1912),

    $(\iota z)\mathcal{Q}(z, x_1, \ldots, x_n)$    (8)

The “$z$” in (8) above is a bound variable.† This new type of term is read
“the unique $z$ such that ...”. This “$\iota$” is not one of our primitive symbols.‡
It is just meant to lead to the friendly shorthand (8) above that avoids the
“ontology” issue. Thus, once one proves

    $(\exists! z)\mathcal{Q}(z, x_1, \ldots, x_n)$    (9)

one can then introduce (8) by the axiom

    $y = (\iota z)\mathcal{Q}(z, x_1, \ldots, x_n) \leftrightarrow \mathcal{Q}(y, x_1, \ldots, x_n)$    (5′)

which, of course, is an alias for the axiom (5), using more suggestive notation
for the term $f x_1 \ldots x_n$. By (9), the axioms (5) or (5′) can be replaced
by

    $\mathcal{Q}(f x_1 \ldots x_n, x_1, \ldots, x_n)$

and

    $\mathcal{Q}((\iota z)\mathcal{Q}(z, x_1, \ldots, x_n), x_1, \ldots, x_n)$    (10)

respectively. For example, from (5′) we get (10) by substitution. Now, Ax4
(with some help from $\models_{\mathrm{Taut}}$) yields

    $\mathcal{Q}((\iota z)\mathcal{Q}(z, x_1, \ldots, x_n), x_1, \ldots, x_n) \to$
        $(y = (\iota z)\mathcal{Q}(z, x_1, \ldots, x_n) \to \mathcal{Q}(y, x_1, \ldots, x_n))$

Hence, assuming (10),

    $y = (\iota z)\mathcal{Q}(z, x_1, \ldots, x_n) \to \mathcal{Q}(y, x_1, \ldots, x_n)$    (11)

Finally, deploying (9), we get

    $\mathcal{Q}((\iota z)\mathcal{Q}(z, x_1, \ldots, x_n), x_1, \ldots, x_n) \to$
        $(\mathcal{Q}(y, x_1, \ldots, x_n) \to y = (\iota z)\mathcal{Q}(z, x_1, \ldots, x_n))$

Hence

    $\mathcal{Q}(y, x_1, \ldots, x_n) \to y = (\iota z)\mathcal{Q}(z, x_1, \ldots, x_n)$

by (10). This, along with (11), yields (5′).

† That it must be distinct from the $x_i$ is obvious.
‡ It is however possible to enlarge our alphabet to include “$\iota$”, and then add definitions of the
syntax of “$\iota$-terms” and axioms for the behaviour of “$\iota$-terms”. At the end of all this one gets a
conservative extension of the original theory, i.e., any $\iota$-free formula provable in the new theory
can be also proved in the old (Hilbert and Bernays (1968)).


The Indefinite Article. We often have the following situation: We have proved
a statement like

    $(\exists x)\mathcal{A}[x]$    (1)

and we want next to derive a statement $\mathcal{B}$. To this end, we start by picking a
symbol $c$ not in $\mathcal{B}$ and say “let $c$ be such that $\mathcal{A}[c]$ is true”.† That is, we add
$\mathcal{A}[c]$ as a nonlogical axiom, treating $c$ as a new constant.

From all these assumptions we then manage to prove $\mathcal{B}$, hopefully treating
all the free variables of $\mathcal{A}[c]$ as constants during the argument. We then
conclude that $\mathcal{B}$ has been derived without the help of $\mathcal{A}[c]$ or $c$ (see I.4.27).

Two things are noteworthy in this technique: One, $c$ does not occur in the
conclusion, and, two, $c$ is not uniquely determined by (1). So we have a (rather
than the) $c$ that makes $\mathcal{A}[c]$ true.


Now the suggestion that the free variables of the latter be frozen during the
derivation of $\mathcal{B}$ is unnecessarily restrictive, and we have a more general result:
Suppose that

    $\Gamma \vdash (\exists x)\mathcal{A}(x, y_1, \ldots, y_n)$    (2)

Add a new function symbol $f$ to the language $L$ of $\Gamma$ (thus obtaining $L'$) via
the axiom

    $\mathcal{A}(f y_1 \ldots y_n, y_1, \ldots, y_n)$    (3)

This says, intuitively, “for any $y_1, \ldots, y_n$, let $x = f\vec{y}_n$ make $\mathcal{A}(x, \vec{y}_n)$ true”.
Again, this $x$ is not uniquely determined by (2).

Finally, suppose that we have a proof

    $\Gamma + \mathcal{A}(f\vec{y}_n, \vec{y}_n) \vdash \mathcal{B}$    (4)

such that $f$, the new function symbol, occurs nowhere in $\mathcal{B}$, i.e., the latter
formula is over $L$. We can conclude then that

    $\Gamma \vdash \mathcal{B}$    (5)

that is, the extension $\Gamma + \mathcal{A}(f\vec{y}_n, \vec{y}_n)$ of $\Gamma$ is conservative.


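† Cf. I.5.44.

A proof of the legitimacy of this technique, based on the completeness
theorem, is easy. Let

    $\models_{\mathfrak{M}} \Gamma$    (6)

and show

    $\models_{\mathfrak{M}} \mathcal{B}$    (7)

Expand the model $\mathfrak{M} = (M, \mathcal{I})$ to $\mathfrak{M}' = (M, \mathcal{I}')$, so that $\mathcal{I}'$ interprets
the new symbol $f$. The interpretation is chosen as follows: (2) guarantees
that, for all choices of $i_1, \ldots, i_n$ in $M$, the set $S(i_1, \ldots, i_n) = \{a \in M :
\models_{\mathfrak{M}} \mathcal{A}(\bar{a}, \bar{i}_1, \ldots, \bar{i}_n)\}$ is not empty.

By the axiom of choice (of informal set theory), we can pick an $a(i_1, \ldots, i_n)$†
in each $S(i_1, \ldots, i_n)$. Thus, we define a function $\hat{f} : M^n \to M$ by letting, for
each $i_1, \ldots, i_n$ in $M$, $\hat{f}(i_1, \ldots, i_n) = a(i_1, \ldots, i_n)$.

The next step is to set

    $f^{\mathcal{I}'} = \hat{f}$

Therefore, for all $i_1, \ldots, i_n$ in $M$,

    $(f \bar{i}_1 \ldots \bar{i}_n)^{\mathcal{I}'} = \hat{f}(i_1, \ldots, i_n) = a(i_1, \ldots, i_n)$

It is now clear that $\models_{\mathfrak{M}'} \mathcal{A}(f y_1 \ldots y_n, y_1, \ldots, y_n)$, for, by I.5.11,

    $(\mathcal{A}(f \bar{i}_1 \ldots \bar{i}_n, \bar{i}_1, \ldots, \bar{i}_n))^{\mathcal{I}'} = \mathbf{t} \leftrightarrow (\mathcal{A}(\overline{a(i_1, \ldots, i_n)}, \bar{i}_1, \ldots, \bar{i}_n))^{\mathcal{I}'} = \mathbf{t}$

and the right hand side of the above is true by the choice of $a(i_1, \ldots, i_n)$.

Thus, $\models_{\mathfrak{M}'} \Gamma + \mathcal{A}(f y_1 \ldots y_n, y_1, \ldots, y_n)$; hence $\models_{\mathfrak{M}'} \mathcal{B}$, by (4). Since $\mathcal{B}$
contains no $f$, we also have $\models_{\mathfrak{M}} \mathcal{B}$; thus we have established (7) from (6). We
now have (5).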

One can give a number of names to a function like $f$: a Skolem function,
an $\epsilon$-term (Hilbert (1968)), or a $\tau$-term (Bourbaki (1966b)). In the first case
one may ornament the symbol $f$, e.g., $f_{\exists\mathcal{A}}$, to show where it is coming from,
although such mnemonic naming is not, of course, mandatory.

The last two terminologies actually apply to the term $f y_1 \ldots y_n$, rather than
to the function symbol $f$. Hilbert would have written

    $(\epsilon x)\mathcal{A}(x, y_1, \ldots, y_n)$    (8)

and Bourbaki

    $(\tau x)\mathcal{A}(x, y_1, \ldots, y_n)$    (9)

each denoting $f y_1 \ldots y_n$. The “$x$” in each of (8) and (9) is a bound variable
(different from each $y_i$).


I.8. Computability and Uncomputability


† The “$(i_1, \ldots, i_n)$” part indicates that “$a$” depends on $i_1, \ldots, i_n$.

Computability (or “recursion theory”) is nowadays classified as an area of logic
(e.g., it is one of the areas represented in the Handbook of Mathematical Logic,
Barwise (1978)). It has its origins in the work of several logicians in the 1930s
(Gödel, Turing, Kleene, Church, Post, et al.). Motivation for this research was
partly provided by Hilbert’s program to found all mathematics on formalism.
This was a formalism that one ought to be able to certify by finitary means
(for each particular formalized theory) to be free of contradiction. Moreover,
it was a formalism for which, Hilbert expected, a “method” ought to exist
to solve the Entscheidungsproblem (decision problem), that is, the question “is
this arbitrary formula a theorem, or not?”

What was a “method” supposed to be, exactly, mathematically speaking? 
Was the expectation that the Entscheidungsproblem of any theory is amenable 
to algorithmic solution realistic? Work of Church (lack of a decision algorithm 
for certain theories (1936)) showed that it was not, nor for that matter was 
the expectation of certifying freedom of contradiction of all formal theories by 
finitary means (Gödel’s second incompleteness theorem).

One of these two negative answers (Church’s) built on an emerging theory 
of computable (or algorithmic) functions and the mathematical formulation of 
the concepts of algorithm or method. The other one, Gödel’s, while it used
existing (pre-Turing and pre-Kleene) rudiments of computability (primitive 
recursive functions of Dedekind), can be recast, in hindsight, in the framework 
of modern computability. This recasting shows the intimate connection between 
the phenomena of incompletableness of certain theories and uncomputability, 
and thus it enhances our understanding of both phenomena. 

With the advent of computers and the development of computer science, 
computability gained a new set of practitioners and researchers: theoretical 
computer scientists. This group approaches the area from two (main) stand- 
points: to study the power and limitations of mathematical models of computing 
devices (after all, computer programs are algorithms), and also to understand 
why some problems have “easy” while others have “hard” algorithmic solutions 
(complexity theory) — in the process devising several “practical” (or efficient) 
solutions, and techniques, for a plethora of practical problems. 

We develop the basics of computability here informally, that is, within “real 
mathematics” (in the metatheory of pure and applied first order logic). 

Computability, generally speaking, formalizes the concept of a “computable
function” $f : \mathbb{N}^k \to \mathbb{N}$. That is, it concerns itself with the issue of separating the
set of all so-called number-theoretic functions (that is,† functions with inputs
in $\mathbb{N}$ and outputs in $\mathbb{N}$) into computable and uncomputable.

Because we want the theory to be as inclusive as possible, we allow it to
study both total and nontotal functions $f : \mathbb{N}^k \to \mathbb{N}$.

† More precisely, this is what ordinary computability or ordinary recursion theory studies. Higher
recursion theory, invented by Kleene, also looks into functions that have higher order inputs such
as number-theoretic functions.

The trivial reason is that in everyday computing we do encounter both total
and nontotal functions. There are computer programs which (whether or not 
according to the programmer’s intent) do not stop to yield an answer for all 
possible inputs. We do want to have formal counterparts of those in our theory, 
if we are hoping to have a theory that is inclusive. 

A less trivial reason is that unless we allow nontotal functions in the theory, an 
obvious diagonalization can show the existence of total (intuitively) computable 
functions that are not in the theory. 


I.8.1 Definition. Any number-theoretic function $f : \mathbb{N}^k \to \mathbb{N}$ is a partial
function. If its domain, dom($f$), equals $\mathbb{N}^k$ (the set of all potential inputs, or
left field), then we say that $f$ is total. If it does not, then $f$ is nontotal.

That $\vec{a} \in$ dom($f$) is also denoted by $f(\vec{a})\downarrow$, and we say that $f$ is defined at
$\vec{a}$ or that $f(\vec{a})$ converges.† In the opposite case we write $f(\vec{a})\uparrow$ and say that $f$
is undefined at $\vec{a}$ or that $f(\vec{a})$ diverges.

A number-theoretic relation is a subset of $\mathbb{N}^n$. We usually write such relations
in relational notation. That is, we write $R(a_1, \ldots, a_n)$ for $\langle a_1, \ldots, a_n \rangle \in R$.
Thus our notation of relations parallels that of formulas of a first order language,
and we use the logical connectives ($\exists$, $\forall$, $\neg$, $\lor$, etc.) informally to combine
relations. We carry that parallel to the next natural step, and use the phrases
“... a relation $R$ ...” and “... a relation $R(y_1, \ldots, y_n)$ ...” interchangeably,
the latter to convey that the full list of the relation’s variables is exactly $y_1, \ldots, y_n$
(cf. p. 18).

We occasionally use $\lambda$-notation to modify a given relation $R(y_1, \ldots, y_n)$.
This notation is employed as in $\lambda z_1 \ldots z_r.R$, or even $\lambda z_1 \ldots z_r.R(y_1, \ldots, y_n)$.
The part “$\lambda z_1 \ldots z_r.$” denotes that “$z_1, \ldots, z_r$” is the active variables list and
supersedes the list “$y_1, \ldots, y_n$”. Any $y_i$ that is not in the list $z_1, \ldots, z_r$ is
treated as a constant (or “parameter”, i.e., it is “frozen”). The list $z_1, \ldots, z_r$
may contain additional variables not in the list $y_1, \ldots, y_n$.

Thus, e.g., $\lambda xy.x < 2 = \{0, 1\} \times \mathbb{N}$, while $\lambda yx.x < 2 = \mathbb{N} \times \{0, 1\}$. On
the other hand, $\lambda x.x < y = \{x : x < y\}$, which denotes a different relation for
different values of the parameter $y$.

Finally, as before, $\vec{z}_r$ or just $\vec{z}$ (if $r$ is understood) denotes $z_1, \ldots, z_r$, so that
we may write $\lambda\vec{z}_r.R(y_1, \ldots, y_n)$.


I.8.2 Definition (Bounded Quantification). For any relation $R$, the symbols
$(\exists x)_{<z}R$, $(\forall x)_{<z}R$, $(\exists x)_{\le z}R$, $(\forall x)_{\le z}R$ stand for $(\exists x)(x < z \land R)$,
$(\forall x)(x < z \to R)$, $(\exists x)(x \le z \land R)$, $(\forall x)(x \le z \to R)$, respectively. We say that they
denote bounded quantification.


† This nomenclature parallels that of “convergent” or “halting” computations.

I.8.3 Definition. If $R \subseteq \mathbb{N}^{n+1}$ is a relation and $f : \mathbb{N}^k \to \mathbb{N}$ a function, then
$R(w_1, \ldots, w_m, f(\vec{x}_k), w_{m+1}, \ldots, w_n)$ means

    $(\exists z)(R(w_1, \ldots, w_m, z, w_{m+1}, \ldots, w_n) \land z = f(\vec{x}_k))$

We have just one important exception to this rule: If $Q$ is $g(\vec{y}) = w$, then
$g(\vec{y}) = f(\vec{x}_k)$ means

    $g(\vec{y})\uparrow \land f(\vec{x}_k)\uparrow \lor (\exists z)(z = g(\vec{y}) \land z = f(\vec{x}_k))$

One often writes $g(\vec{y}) \simeq f(\vec{x}_k)$ for the above to alert the reader that “weak
equality” (a notion due to Kleene) applies, but we will rather use “=” throughout
and let the context determine the meaning.


Clearly, weak equality restores reflexivity of “=” (which fails if the general
understanding of substitution above applied to “=” as well).

$\lambda$-notation comes in handy in denoting number-theoretic functions. Instead
of saying “consider the function $g$ obtained from $f : \mathbb{N}^k \to \mathbb{N}$, by setting, for
all $\vec{w}_m$,

    $g(\vec{w}_m) = f(\vec{x}_k)$

where if an $x_i$ is not among the $w_j$ it is understood to be an (unspecified)
constant”, we simply say “consider $g = \lambda\vec{w}_m.f(\vec{x}_k)$”.


I.8.4 Example. Turning to inequalities, $f(x) > 0$ means (is equivalent to)
$(\exists y)(y = f(x) \land y > 0)$. In particular, it implies that $f(x)\downarrow$.


I.8.5 Example. In the presence of partial functions, $\neg A = B$ and $A \ne B$ are
not interchangeable. For example, $f(a) \ne b$ says (by I.8.3) that $(\exists y)(f(a) =
y \land y \ne b)$. In particular, this entails that $f(a)\downarrow$. On the other hand, $\neg f(a) = b$
holds iff $f(a)\uparrow \lor (\exists y)(f(a) = y \land y \ne b)$.
We are not changing the rules of logic here, but are just amending our
understanding of the semantics of the metanotation “$\ne$”, to make it correct in
the presence of partial functions.
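A small Python sketch may make the distinctions of I.8.3 to I.8.5 concrete. It assumes, purely for illustration, that "undefined" ($\uparrow$) is modelled by the value `None` (a genuinely divergent computation would never return at all); the partial function `f` and all helper names are hypothetical.

```python
from typing import Optional

def f(a: int) -> Optional[int]:
    """Hypothetical partial function: defined (as a/2) only on even inputs."""
    return a // 2 if a % 2 == 0 else None   # None stands for "f(a) is undefined"

def weak_eq(u: Optional[int], v: Optional[int]) -> bool:
    """Kleene's weak equality: both sides undefined, or both defined and equal."""
    if u is None or v is None:
        return u is None and v is None
    return u == v

def neq(u: Optional[int], b: int) -> bool:
    """'f(a) != b' as in I.8.5: there is a y with f(a) = y and y != b."""
    return u is not None and u != b

def not_eq(u: Optional[int], b: int) -> bool:
    """'not (f(a) = b)': f(a) is undefined, or it is defined and differs from b."""
    return u is None or u != b
```

With these conventions, `neq(f(3), 0)` is false while `not_eq(f(3), 0)` is true, because `f(3)` is undefined; on defined values the two readings agree.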


There are many approaches to defining computable functions, and they are
all equivalent, that is, they define exactly the same set of functions. All except
two of them begin by defining a notion of “computation model”, that is, a set
of string-manipulation algorithms (e.g., Turing machines, Markov algorithms,
Kleene’s equation manipulation processes), and then they define a computable
function as one whose input-output relationship, coded as a relation on strings,
can be verified by an algorithm belonging to the computation model.

† Cf. I.7.2.

There are two number-theoretic approaches, both due to Kleene, one using
so-called Kleene schemata† and one that inductively defines the set of
computable functions, bypassing the concepts of “algorithm” or “computation”.‡

We follow the latter approach in this section. According to this, the set
of computable functions is the smallest set of functions that includes some
“indisputably computable” functions, and is closed under some “indisputably
algorithmic” operations.§

The following are operations (on number-theoretic functions) that are cen- 
trally important: 


I.8.6 Definition (Composition). Let $\lambda\vec{x}.g_i(\vec{x})$ ($i = 1, \ldots, n$) and $\lambda\vec{y}_n.f(\vec{y}_n)$
be given functions.¶ Then $h = \lambda\vec{x}.f(g_1(\vec{x}), \ldots, g_n(\vec{x}))$ is the result of their
composition.


Note the requirement that all the variables of the “outermost” function, $f$, be
substituted, and that each substitution (a function application, $g_i(\vec{x})$) apply to
the same variable list $\vec{x}$. With additional tools, we can eventually relax this very
rigid requirement.


I.8.7 Definition (Primitive Recursion). Let $\lambda x\vec{y}_n z.g(x, \vec{y}_n, z)$ and $\lambda\vec{y}_n.h(\vec{y}_n)$
be given. We say that $\lambda x\vec{y}_n.f(x, \vec{y}_n)$ is obtained by primitive recursion from
$h$ and $g$ just in case it satisfies, for all $x$ and $\vec{y}_n$, the following equations (the
so-called primitive recursive schema):

    $f(0, \vec{y}_n) = h(\vec{y}_n)$
    $f(x + 1, \vec{y}_n) = g(x, \vec{y}_n, f(x, \vec{y}_n))$

I.8.8 Definition (Unbounded Search). Given $\lambda x\vec{y}_n.g(x, \vec{y}_n)$, $f$ is defined from
$g$ by unbounded search on the variable $x$ just in case, for all $\vec{y}_n$, the following
holds:

    $f(\vec{y}_n) = \begin{cases} \min\{x : g(x, \vec{y}_n) = 0 \land (\forall z)_{<x}\, g(z, \vec{y}_n)\downarrow\} \\ \uparrow \quad \text{if the minimum above does not exist} \end{cases}$    (1)

In (1) above, the case “$\uparrow$” is short for “$f(\vec{y}_n)$ is undefined”. We write $f(\vec{y}_n) =
(\mu x)g(x, \vec{y}_n)$ as a short form of (1).

† These characterize inductively the set of all number-tuples $(z, \vec{x}, y)$ which are intuitively
understood to “code” the statement that the machine, or algorithm, $z$, when presented with input $\vec{x}$,
will eventually output $y$.
‡ Work on this originated with Dedekind, who characterized in this manner a proper subset of
computable functions, that of primitive recursive functions.
§ The reader will agree, once all the details are in hand, that the qualification “indisputably” is apt.
¶ A function in this section, unless otherwise explicitly stated, is a number-theoretic partial function.


I.8.9 Example. The condition “$g(x, \vec{y}_n) = 0 \land (\forall z)_{<x}\, g(z, \vec{y}_n)\downarrow$” is rather
complicated. It says that (see also I.8.4)

    $g(0, \vec{y}_n) > 0,\ g(1, \vec{y}_n) > 0,\ \ldots,\ g(x - 1, \vec{y}_n) > 0$

but $g(x, \vec{y}_n) = 0$. For example, suppose that

    $g(x, y) = \begin{cases} 0 & \text{if } x = y = 1 \\ \uparrow & \text{otherwise} \end{cases}$

Then, while the smallest $x$ such that $g(x, 1) = 0$ holds is $x = 1$, this is not what
(1) “computes”. The definition (1) yields undefined in this case, since $g(0, 1)\uparrow$.

Of course, the part “$(\forall z)_{<x}\, g(z, \vec{y}_n)\downarrow$” in (1) is superfluous if $g$ is total.
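The behaviour described in I.8.8 and I.8.9 can be imitated by the following Python sketch. Two simplifications are assumptions of the sketch and not part of the definition: an undefined value of $g$ is modelled by `None`, and an artificial cutoff `max_steps` makes the search give up (returning `None`) instead of running forever. The names `mu`, `g_example`, and `ceil_sqrt` are hypothetical.

```python
def mu(g, max_steps=10_000):
    """Return f with f(y...) = (mu x) g(x, y...), where None models 'undefined'."""
    def f(*ys):
        for x in range(max_steps):
            value = g(x, *ys)
            if value is None:   # g(x, y...) is undefined: the search is blocked here
                return None
            if value == 0:      # first x with g(x, y...) = 0, all earlier values defined
                return x
        return None             # cutoff reached; a true unbounded search would diverge
    return f

def g_example(x, y):
    """The g of I.8.9: 0 if x = y = 1, undefined otherwise."""
    return 0 if x == 1 and y == 1 else None

# A total g, for contrast: the least x with x*x >= y.
ceil_sqrt = mu(lambda x, y: 0 if x * x >= y else 1)
```

Here `mu(g_example)(1)` yields `None` even though `g_example(1, 1)` is 0, because the search is already blocked at `x = 0`.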


The following functions are intuitively computable. They form the basis of 
an inductive definition of all computable functions. 


I.8.10 Definition (Initial Functions).

    Zero: $z$    ($\lambda x.0$)
    Successor: $s$    ($\lambda x.x + 1$)
    Identities or projections: $u_i^n$, for $n \ge 1$ and $1 \le i \le n$    ($\lambda\vec{x}_n.x_i$).


I.8.11 Definition. The set of partial computable or partial recursive functions,
$\mathcal{P}$, is the closure of the initial functions above, under the operations composition,
primitive recursion, and unbounded search.

The set of computable or recursive functions, $\mathcal{R}$, is the set of all total functions
of $\mathcal{P}$.


One occasionally sees terminology such as “computable partial functions” or
“recursive partial functions”. Of course, “partial” qualifies “functions” (not
“recursive” or “computable”): therefore one hopes never to see “partially recursive
functions” or “partially computable functions”.
I.8.12 Definition. The set of primitive recursive functions, $\mathcal{PR}$, is the closure
of the initial functions above under the operations composition and primitive
recursion.


The primitive recursive functions were defined by Dedekind and were called
“recursive” until the recursive functions of I.8.11 were defined. Then the name
of the functions of Dedekind was qualified to be “primitive”.

Why are the functions in $\mathcal{P}$ “computable”?† Well, an (informal) induction
on the definition (I.8.11) shows why this is “correct”.

The initial functions are clearly intuitively computable (e.g., by pencil and
paper, by anyone who knows how to add 1 to an arbitrary natural number).

Suppose that each of $\lambda\vec{x}.g_i(\vec{x})$ ($i = 1, \ldots, n$) and $\lambda\vec{y}_n.f(\vec{y}_n)$ are intuitively
computable (i.e., we know how to compute the output, given the input). To
compute $f(g_1(\vec{a}), \ldots, g_n(\vec{a}))$, given $\vec{a}$, we compute each of the $g_i(\vec{a})$, and then
use the results as inputs to $f$.


To see why f (defined by a primitive recursive schema from h and g) is 
computable if h and g are, let us first introduce the notation z := x, which we 
understand to say “copy the value of x into z”. 


Then we can write an “algorithm” for the computation of $f(a, \vec{b}_n)$:

    (1) $z := h(\vec{b}_n)$
    Repeat (2) below for $i = 0, 1, 2, \ldots, a - 1$:
    (2) $z := g(i, \vec{b}_n, z)$

Since (I.H.) the computations $h(\vec{b}_n)$ and $g(i, \vec{b}_n, z)$ can be carried out (regardless
of the input values $\vec{b}_n$, $i$, and $z$), at the end of the “computation” indicated above,
$z$ holds the value $f(a, \vec{b}_n)$.

Finally, let $\lambda x\vec{y}_n.g(x, \vec{y}_n)$ be intuitively computable. We show how to
compute $\lambda\vec{y}_n.(\mu x)g(x, \vec{y}_n)$:

    (1) $x := 0$.
    (2) if $g(x, \vec{b}_n) = 0$, go to step (5).
    (3) $x := x + 1$.
    (4) go back to step (2).
    (5) Done! $x$ holds the result.

The above algorithm justifies the term “unbounded search”. We are searching
by letting $x = 0, 1, 2, \ldots$ in turn. It is “unbounded” since we have no a priori
knowledge of how far the search will have to go. It is also clear that the algorithm
satisfies the definition of $(\mu x)$:‡ We will hit step (5) iff progress was never
blocked at step (2) (i.e., iff all along $g(i, \vec{b}_n) > 0$ (see I.8.4) until the first
(smallest) $i$ came along for which $g(i, \vec{b}_n) = 0$).

† We have “computable” and computable. The former connotes our intuitive understanding of the
term. It means “intuitively computable”. The latter has an exact definition (I.8.11).


We have our first few simple results: 


I.8.13 Proposition. $\mathcal{R} \subset \mathcal{P}$.


Proof. The $\subseteq$-part is by definition. The $\ne$-part follows from the fact that $e \in
\mathcal{P}$ but $e \notin \mathcal{R}$, where we have denoted by “$e$” the totally undefined (empty)
function $\lambda y.(\mu x)s(u_1^2(x, y))$ (in short, $e(y)$, for any $y$, is the smallest $x$ such
that $x + 1 = 0$; but such an $x$ does not exist).


I.8.14 Proposition. $\mathcal{R}$ is closed under composition and primitive recursion.


Proof. These two operations preserve total functions (why?).


I.8.15 Corollary. $\mathcal{PR} \subseteq \mathcal{R}$.


Proof. By induction on $\mathcal{PR}$, since the initial functions (common to $\mathcal{PR}$ and
$\mathcal{P}$) are total and hence are in $\mathcal{R}$.


Thus all primitive recursive functions are total.


It can be shown that the inclusion $\mathcal{PR} \subseteq \mathcal{R}$ is proper, but we will not need
this result (see, e.g., Tourlakis (1984)).


I.8.16 Definition. A relation $R(\vec{x})$ is (primitive) recursive iff its characteristic
function,

    $\chi_R = \lambda\vec{x}.\begin{cases} 0 & \text{if } R(\vec{x}) \\ 1 & \text{if } \neg R(\vec{x}) \end{cases}$

is (primitive) recursive.

The set of all primitive recursive (recursive) relations, or predicates,§ is
denoted by $\mathcal{PR}_*$ ($\mathcal{R}_*$).

‡ By the way, in modern Greek, one pronounces “$\mu$” exactly like the English word “me”.
§ Relations are often called “predicates” by computability practitioners.

Since we are to stay within $\mathbb{N}$, we need a special kind of subtraction, proper
subtraction:†

    $x \mathbin{\dot{-}} y \stackrel{\text{def}}{=} \begin{cases} x - y & \text{if } x \ge y \\ 0 & \text{otherwise} \end{cases}$
I.8.17 Example. This example illustrates some important techniques used to
circumvent the rigidity of our definitions.
We prove that $\lambda xy.x \mathbin{\dot{-}} y \in \mathcal{PR}$. First, we look at a special case. Let $p =
\lambda x.x \mathbin{\dot{-}} 1$ and $\tilde{p} = \lambda xy.p(x)$. Now $\tilde{p}$ is primitive recursive, since

    $\tilde{p}(0, y) = z(y)$
    $\tilde{p}(x + 1, y) = u_1^3(x, y, \tilde{p}(x, y))$    (1)

Thus, so is

    $p = \lambda x.\tilde{p}(u_1^1(x), z(x))$    (2)

Finally, let $d = \lambda xy.y \mathbin{\dot{-}} x$. This is in $\mathcal{PR}$, since

    $d(0, y) = u_1^1(y)$    (3)
    $d(x + 1, y) = p(u_3^3(x, y, d(x, y)))$

Thus, $\lambda xy.x \mathbin{\dot{-}} y$ is primitive recursive, since

    $\lambda xy.x \mathbin{\dot{-}} y = \lambda xy.d(u_2^2(x, y), u_1^2(x, y))$    (4)

Our acrobatics here have worked around the following formal difficulties:

(i) Our number-theoretic functions have at least one argument. Thus, any
instance of the primitive recursive schema must define a function of at
least two arguments. This explains the introduction of $\tilde{p}$ in the schema (1).

(ii) A more user-friendly way to write (1) (in the argot of recursion theory) is

    $p(0) = 0$
    $p(x + 1) = x$

Indeed, “$u_1^3(x, y, \tilde{p}(x, y))$” is a fancy way (respecting the form of the
primitive recursive schema) to just say “$x$”. Moreover, one simply writes
$p = \lambda x.\tilde{p}(x, 0)$ instead of (2) above.

(iii) Finally, (3) and (4) get around the fact that the primitive recursion schema
iterates via the first variable. As this example shows, this is not cast in
stone, for we can swap variables (with the help of the $u_i^n$).

† Some authors pronounce proper subtraction monus.

One must be careful not to gloss over this last hurdle by shrugging it off:
“What’s in a name?”. It is not a matter of changing names everywhere to go from
$\lambda xy.x \mathbin{\dot{-}} y$ to $\lambda yx.y \mathbin{\dot{-}} x$. We actually needed to work with the first variable in
the $\lambda$-list, but (because of the nature of “$\dot{-}$”) this variable should be after “$\dot{-}$”.
That is, we did need $d = \lambda xy.y \mathbin{\dot{-}} x$.


In argot, (3) takes the simple form

    $x \mathbin{\dot{-}} 0 = x$
    $x \mathbin{\dot{-}} (y + 1) = (x \mathbin{\dot{-}} y) \mathbin{\dot{-}} 1$

The reader must have concluded (correctly) that the argot operations of per- 
muting variables, identifying variables, augmenting the variable list with new 
variables (also, replacing a single variable with a function application or a con- 
stant) are not argot at all, but are derived “legal” operations of substitution (due 
to Grzegorczyk (1953); see Exercise I.68).

Therefore, from now on we will relax our notational rigidity and benefit 
from the presence of these operations of substitution. 
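For readers who want to see the formal derivation of I.8.17 executed step by step, here is a Python sketch that builds proper subtraction strictly from zero, projections, composition, and primitive recursion, following equations (1) to (4). The combinator names `compose`, `prim_rec`, `u`, and the identifier `p_tilde` (standing for the two-argument auxiliary function of schema (1)) are assumptions of the sketch.

```python
def compose(f, *gs):
    """h(x...) = f(g_1(x...), ..., g_n(x...))."""
    return lambda *xs: f(*(g(*xs) for g in gs))

def prim_rec(h, g):
    """f(0, y...) = h(y...);  f(x+1, y...) = g(x, y..., f(x, y...))."""
    def f(x, *ys):
        acc = h(*ys)
        for i in range(x):
            acc = g(i, *ys, acc)
        return acc
    return f

def z(x):
    """Zero function."""
    return 0

def u(n, i):
    """Projection u_i^n: pick the i-th of n arguments (1-based)."""
    def proj(*xs):
        assert len(xs) == n
        return xs[i - 1]
    return proj

p_tilde = prim_rec(z, u(3, 1))                  # schema (1): p_tilde(x, y) = x monus 1
p = compose(p_tilde, u(1, 1), z)                # (2): p(x) = p_tilde(x, 0)
d = prim_rec(u(1, 1), compose(p, u(3, 3)))      # (3): d(x, y) = y monus x
monus = compose(d, u(2, 2), u(2, 1))            # (4): x monus y = d(y, x)
```

The last line is exactly the variable swap discussed in item (iii): the recursion runs on the first argument of `d`, and the projections put the subtrahend there.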


I.8.18 Example. $\lambda xy.x + y$, $\lambda xy.x \times y$ (or, in implied multiplication notation,
$\lambda xy.xy$), and $\lambda xy.x^y$ are in $\mathcal{PR}$. Let us leave the first two as an easy exercise,
and deal with the third one, since it entails an important point:

    $x^0 = 1$
    $x^{y+1} = x \times x^y$

The “important point” is regarding the basis case, $x^0 = 1$. We learn in “ordinary
math” that $0^0$ is undefined. If we sustain this point of view, then $\lambda xy.x^y$ cannot
possibly be in $\mathcal{PR}$ (why?). So we re-define $0^0$ to be 1.

One does this kind of re-definition a lot in recursion theory (it is akin to
removing removable discontinuities in calculus) when a function threatens not
to be, say, primitive recursive for trivial reasons.


A trivial corollary is that $\lambda x.0^x \in \mathcal{PR}$ (why?). This is a useful function,
normally denoted by $\overline{sg}$. Clearly,

    $\overline{sg}(x) = \begin{cases} 1 & \text{if } x = 0 \\ 0 & \text{otherwise} \end{cases}$

We also see that $\overline{sg}(x) = 1 \mathbin{\dot{-}} x$, which provides an alternative proof that $\lambda x.0^x \in
\mathcal{PR}$.
I.8.19 Example.

    $\lambda xyz.\begin{cases} y & \text{if } x = 0 \\ z & \text{if } x \ne 0 \end{cases}$

is in $\mathcal{PR}$. This function is often called the “switch” or “if-then-else”, and is
sometimes denoted by the name “sw”.
We rest our case, since

    $sw(0, y, z) = y$
    $sw(x + 1, y, z) = z$

We see immediately that $sw(x, 1, 0) = 0^x = 1 \mathbin{\dot{-}} x$. The function $\lambda x.sw(x,
0, 1) = \lambda x.1 \mathbin{\dot{-}} (1 \mathbin{\dot{-}} x)$ has a special symbol: “sg”. It is often called the signum,
since it gives the sign of its argument.
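The identities relating the switch, the signum, and proper subtraction are easy to check mechanically. The following Python sketch does so on small inputs; the function names are illustrative, and `monus` is written directly rather than derived from the schemata.

```python
def monus(x, y):
    """Proper subtraction: x - y if x >= y, else 0."""
    return x - y if x >= y else 0

def sw(x, y, z):
    """Switch: sw(0, y, z) = y and sw(x+1, y, z) = z."""
    return y if x == 0 else z

def sg_bar(x):
    """1 if x = 0, else 0; defined here as sw(x, 1, 0)."""
    return sw(x, 1, 0)

def sg(x):
    """Signum: 0 if x = 0, else 1; defined here as sw(x, 0, 1)."""
    return sw(x, 0, 1)

def identities_hold(limit=20):
    """Check sg_bar(x) = 0^x = 1 monus x and sg(x) = 1 monus (1 monus x) for x < limit."""
    return all(
        sg_bar(x) == 0 ** x == monus(1, x) and sg(x) == monus(1, monus(1, x))
        for x in range(limit)
    )
```

Python evaluates `0 ** 0` as 1, which happens to agree with the re-definition of $0^0$ adopted in I.8.18.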


I.8.20 Lemma. $R(\vec{x})$ is in $\mathcal{PR}_*$ (respectively, $\mathcal{R}_*$) iff, for some $f \in \mathcal{PR}$
(respectively, $f \in \mathcal{R}$), $R(\vec{x}) \leftrightarrow f(\vec{x}) = 0$.


Proof. Only-if part: Take $f = \chi_R$. If part: $\chi_R = \lambda\vec{x}.sg(f(\vec{x}))$.


I.8.21 Theorem. $\mathcal{PR}_*$ (respectively, $\mathcal{R}_*$) is closed under replacement of
variables by primitive recursive (respectively, recursive) functions.


Proof. If $\chi_R$ is the characteristic function of $R(\vec{x}, y, \vec{z})$ and $f$ is a total function,
then $\lambda\vec{x}\vec{w}\vec{z}.\chi_R(\vec{x}, f(\vec{w}), \vec{z})$ is the characteristic function of $R(\vec{x}, f(\vec{w}), \vec{z})$. (See
also Exercise I.68.)


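I.8.22 Theorem. $\mathcal{PR}_*$ and $\mathcal{R}_*$ are closed under Boolean connectives
(“Boolean operations”) and bounded quantification.


Proof. It suffices to cover $\neg$, $\lor$, $(\exists y)_{<z}$. We are given $R(\vec{x})$, $Q(\vec{y})$, and $P(y, \vec{x})$,
all in $\mathcal{PR}_*$ (or $\mathcal{R}_*$; the argument is the same for both cases).

Case for $\neg$: $\chi_{\neg R} = \lambda\vec{x}.\overline{sg}(\chi_R(\vec{x}))$.

Case for $\lor$: $\chi_{R \lor Q} = \lambda\vec{x}\vec{y}.\chi_R(\vec{x})\chi_Q(\vec{y})$ (where we have used implied
multiplication notation).

Case for $(\exists y)_{<z}$: To unclutter the notation, let us denote by $\chi_{\exists P}$ the
characteristic function of $(\exists y)_{<z} P(y, \vec{x})$. Then

    $\chi_{\exists P}(0, \vec{x}) = 1$
    $\chi_{\exists P}(z + 1, \vec{x}) = \chi_P(z, \vec{x})\chi_{\exists P}(z, \vec{x})$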
I.8.23 Remark. (1) The reader can convince himself that quantifying over the
first variable was only for the sake of notational convenience.
(2) The case for $(\exists y)_{\le z}$ (and therefore for $(\forall y)_{\le z}$) can be easily made:

    $(\exists y)_{\le z}\, Q(y, \vec{x}) \leftrightarrow (\exists y)_{<z+1}\, Q(y, \vec{x})$

(here we have used I.8.21).
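The constructions in the proof of I.8.22 and in Remark I.8.23 translate almost literally into Python. In the sketch below characteristic functions follow the text's convention (0 means true, 1 means false). As a simplifying assumption, `chi_or` takes both relations over one shared argument list, and all helper names are hypothetical.

```python
def chi(pred):
    """Characteristic function of a Python predicate: 0 if it holds, 1 otherwise."""
    return lambda *xs: 0 if pred(*xs) else 1

def sg_bar(x):
    return 1 if x == 0 else 0

def chi_not(chi_r):
    """Case for negation: apply sg_bar to chi_R."""
    return lambda *xs: sg_bar(chi_r(*xs))

def chi_or(chi_r, chi_q):
    """Case for disjunction: the product is 0 iff at least one factor is 0."""
    return lambda *xs: chi_r(*xs) * chi_q(*xs)

def chi_exists_below(chi_p):
    """chi of 'there is y < z with P(y, x...)', as a function of (z, x...)."""
    def chi_ep(z, *xs):
        acc = 1                          # basis: no witness below 0
        for y in range(z):
            acc = chi_p(y, *xs) * acc    # step: multiply in chi_P(y, x...)
        return acc
    return chi_ep

def chi_exists_upto(chi_p):
    """chi of 'there is y <= z with P', by substituting z + 1 for z."""
    below = chi_exists_below(chi_p)
    return lambda z, *xs: below(z + 1, *xs)

def chi_forall_below(chi_p):
    """Bounded universal quantification as not-exists-not."""
    return chi_not(chi_exists_below(chi_not(chi_p)))

# Hypothetical example relation P(y, x): y * y = x
chi_root = chi(lambda y, x: y * y == x)
```

For instance, `chi_exists_below(chi_root)(4, 9)` is 0 because the witness 3 lies below 4, while with bound 3 the strict version returns 1 and `chi_exists_upto(chi_root)(3, 9)` returns 0.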


I.8.24 Example (Bounded Summation and Multiplication). We are collecting
tools that will be useful in our arithmetization of $\mathcal{P}$. Two such tools are the
operations $\sum_{y<z} f(y, \vec{x})$ and $\prod_{y<z} f(y, \vec{x})$. Both $\mathcal{PR}$ and $\mathcal{R}$ are closed under
these operations. For example, here is the reason for $\prod$:

    $\prod_{y<0} f(y, \vec{x}) = 1$
    $\prod_{y<z+1} f(y, \vec{x}) = f(z, \vec{x}) \cdot \prod_{y<z} f(y, \vec{x})$

I.8.25 Definition (Bounded Search). For a total function $\lambda y\vec{x}.g(y, \vec{x})$ we
define for all $\vec{x}$

    $(\mu y)_{<z}\, g(y, \vec{x}) \stackrel{\text{def}}{=} \begin{cases} \min\{y : y < z \land g(y, \vec{x}) = 0\} \\ z \quad \text{if the minimum does not exist} \end{cases}$

The symbol $(\mu y)_{\le z}\, g(y, \vec{x})$ is defined to mean $(\mu y)_{<z+1}\, g(y, \vec{x})$.


Bounded search, $(\mu y)_{<z}$, searches a predetermined domain, $0, 1, \ldots, z - 1$. If
unsuccessful, it returns the first number to the right of the domain.


We extend the use of search on predicates:


I.8.26 Definition. For a predicate $R(y, \vec{x})$, the symbols $(\mu y)R(y, \vec{x})$, $(\mu y)_{<z}
R(y, \vec{x})$, and $(\mu y)_{\le z} R(y, \vec{x})$ mean $(\mu y)\chi_R(y, \vec{x})$, $(\mu y)_{<z}\chi_R(y, \vec{x})$, and
$(\mu y)_{\le z}\chi_R(y, \vec{x})$ respectively.


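I.8.27 Theorem (Definition by Cases). $\mathcal{R}$ and $\mathcal{PR}$ are closed under the schema of definition by cases (below), where it is understood that the relations $R_i$ are mutually exclusive:

$$f(\vec{x}) = \begin{cases} g_1(\vec{x}) & \text{if } R_1(\vec{x}) \\ g_2(\vec{x}) & \text{if } R_2(\vec{x}) \\ \quad\vdots \\ g_k(\vec{x}) & \text{if } R_k(\vec{x}) \\ g_{k+1}(\vec{x}) & \text{otherwise} \end{cases}$$

Proof. Perhaps the simplest proof is to observe that

$$f(\vec{x}) = g_1(\vec{x})\,\overline{\mathrm{sg}}(\chi_{R_1}(\vec{x})) + \cdots + g_k(\vec{x})\,\overline{\mathrm{sg}}(\chi_{R_k}(\vec{x})) + g_{k+1}(\vec{x})\,\overline{\mathrm{sg}}(\chi_Q(\vec{x}))$$

a fixed-length sum, where $Q \leftrightarrow \neg(R_1 \lor \cdots \lor R_k)$.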
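The fixed-length sum can be checked on a small instance. The sketch below is illustrative only: `sg_bar`, `by_cases`, and the sample function are hypothetical names, and characteristic functions follow the convention that 0 means "true".

```python
from math import prod


def sg_bar(u):
    """Return 1 if u == 0, else 0."""
    return 1 if u == 0 else 0


def by_cases(cases, otherwise):
    """cases: list of (g_i, chi_i), where chi_i(x) == 0 exactly when R_i(x) holds."""
    def f(*x):
        chi_some = prod(chi(*x) for _, chi in cases)   # 0 iff some R_i holds
        chi_q = sg_bar(chi_some)                       # 0 iff no R_i holds (the relation Q)
        total = sum(g(*x) * sg_bar(chi(*x)) for g, chi in cases)
        return total + otherwise(*x) * sg_bar(chi_q)
    return f


# Sample: x // 2 if x is even; 3x + 1 if x is odd and x > 1; 99 otherwise.
sample = by_cases(
    [(lambda x: x // 2, lambda x: x % 2),
     (lambda x: 3 * x + 1, lambda x: 0 if x % 2 == 1 and x > 1 else 1)],
    lambda x: 99,
)
```

Because the $R_i$ are mutually exclusive, at most one summand is non-zero for any input.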


I.8.28 Theorem. $\mathcal{R}$ and $\mathcal{PR}$ are closed under bounded search.

Proof. Let $g(z, \vec{x}) = (\mu y)_{<z} f(y, \vec{x})$. Then

    $g(0, \vec{x}) = 0$
    $g(z + 1, \vec{x}) =$ if $g(z, \vec{x}) \neq z$ then $g(z, \vec{x})$
                          else if $f(z, \vec{x}) = 0$ then $z$
                          else $z + 1$

The second equation above is $g(z + 1, \vec{x}) = k(z, \vec{x}, g(z, \vec{x}))$, where

    $k(z, \vec{x}, w) =$ if $w \neq z$ then $w$
                         else if $f(z, \vec{x}) = 0$ then $z$
                         else $z + 1$

Clearly $k$ is wherever $f$ is (in $\mathcal{PR}$ or $\mathcal{R}$), since (see I.8.19) sw $\in \mathcal{PR}$.
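The recursion for $g$ can be followed step by step in a small sketch. The function name `bounded_search` is an illustrative choice, and the loop variable plays the role of $z$ in the passage from $g(z, \vec{x})$ to $g(z + 1, \vec{x})$.

```python
def bounded_search(f, z, *x):
    """Least y < z with f(y, *x) == 0, or z if there is none."""
    g = 0                              # g(0, x) = 0
    for t in range(z):                 # compute g(t + 1, x) from w = g(t, x)
        if g != t:                     # w != t: a zero was already found below t
            continue
        if f(t, *x) == 0:              # w == t and f(t, x) = 0: the answer is t
            g = t
        else:                          # w == t and no zero yet: move right
            g = t + 1
    return g
```

The invariant is that $g(t, \vec{x}) = t$ exactly when no zero of $f$ has been found below $t$.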
I.8.29 Proposition. The following are all primitive recursive:

(i) $\lambda xy.\lfloor x/y \rfloor$ (the quotient of the division $x/y$)
(ii) $\lambda xy.\mathrm{rem}(x, y)$ (the remainder of the division $x/y$)
(iii) $\lambda xy.x \mid y$ ("$x$ divides $y$")
(iv) $\mathrm{Pr}(x)$ ($x$ is a prime)
(v) $\lambda n.p_n$ (the $n$th prime)
(vi) $\lambda nx.\exp(n, x)$ (the exponent of $p_n$ in the prime factorization of $x$)
(vii) $\mathrm{Seq}(x)$ ("$x$'s prime number factorization contains at least one prime, but no gaps")


Proof. (i):

$$\lfloor x/y \rfloor = (\mu z)_{\leq x}((z + 1)y > x) \qquad (1)$$

(1) is correct for all $y \neq 0$. Since we do not want the quotient function to fail primitive recursiveness for a trivial reason (we have a "removable nontotalness"; see also I.8.18), we define $\lfloor x/y \rfloor$ to equal the right hand side of (1) at all times (of course, the right hand side is total).

(ii): $\mathrm{rem}(x, y) = x \mathbin{\dot{-}} \lfloor x/y \rfloor y$.

(iii): $x \mid y \leftrightarrow \mathrm{rem}(y, x) = 0$.

(iv): $\mathrm{Pr}(x) \leftrightarrow x > 1 \land (\forall y)_{\leq x}(y \mid x \to y = 1 \lor y = x)$.

(v):

$$p_0 = 2$$
$$p_{n+1} = (\mu y)_{\leq 2^{2^{n+1}}}(\mathrm{Pr}(y) \land y > p_n)$$

The above is based on Euclid's proof that there are infinitely many primes ($p_0 p_1 \cdots p_n + 1$ is either a prime, $q \geq p_{n+1}$, or it has a prime divisor $q \geq p_{n+1}$) and an induction on $n$ that shows $p_n \leq 2^{2^n}$.

(vi): $\exp(n, x) = (\mu y)_{\leq x}(\neg(p_n^{y+1} \mid x))$.

(vii): $\mathrm{Seq}(x) \leftrightarrow x > 1 \land (\forall y)_{\leq x}(\forall z)_{\leq x}(y \mid x \land \mathrm{Pr}(y) \land \mathrm{Pr}(z) \land z < y \to z \mid x)$.
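The seven definitions translate almost verbatim into code. The sketch below is a naive, slow transcription meant only for small inputs; the helper `mu_le` (bounded search with a non-strict bound) and all function names are illustrative assumptions.

```python
def mu_le(bound, pred):
    """Least y <= bound with pred(y), or bound + 1 if there is none."""
    for y in range(bound + 1):
        if pred(y):
            return y
    return bound + 1


def quot(x, y):
    return mu_le(x, lambda z: (z + 1) * y > x)      # total, also when y == 0


def rem(x, y):
    return max(x - quot(x, y) * y, 0)               # proper subtraction


def divides(x, y):
    return rem(y, x) == 0


def is_prime(x):
    return x > 1 and all(y in (1, x) for y in range(x + 1) if divides(y, x))


def nth_prime(n):
    p = 2                                           # p_0 = 2
    for k in range(n):                              # search for p_{k+1} below 2^(2^(k+1))
        p = mu_le(2 ** 2 ** (k + 1), lambda y: is_prime(y) and y > p)
    return p


def exp(n, x):
    return mu_le(x, lambda y: not divides(nth_prime(n) ** (y + 1), x))


def seq(x):
    if x <= 1:
        return False
    for y in range(x + 1):
        if divides(y, x) and is_prime(y):
            for z in range(y):                      # every smaller prime must divide x too
                if is_prime(z) and not divides(z, x):
                    return False
    return True
```

Note that `quot(7, 0)` returns 8, the "failure" value of the bounded search, which is how the removable nontotalness at $y = 0$ shows up.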


I.8.30 Definition (Coding and Decoding Number Sequences). An arbitrary (finite) sequence of natural numbers $a_0, a_1, \ldots, a_{n-1}$ will be coded as

$$p_0^{a_0+1} p_1^{a_1+1} \cdots p_{n-1}^{a_{n-1}+1}$$

We use the notation

$$\langle a_0, \ldots, a_{n-1} \rangle \overset{\text{def}}{=} \prod_{y<n} p_y^{a_y+1} \qquad (1)$$

In set theory one likes to denote tuples by $\langle a_0, \ldots, a_{n-1} \rangle$ as well, a practice that we have been following (cf. Section I.2). To avoid notational confusion, in those rare cases where we want to write down both a code $\langle a_0, \ldots, a_{n-1} \rangle$ of a sequence $a_0, \ldots, a_{n-1}$ and an $n$-tuple in set theory's sense, we write the latter in the "old" notation, with round brackets, that is, $(a_0, \ldots, a_{n-1})$.


Why "+ 1" in the exponent? Without that, all three sequences "2", "2, 0", and "2, 0, 0" get the same code, namely $2^2$. This is a drawback, for if we are given the code $2^2$ but do not know the length of the coded sequence, then we cannot decode $2^2$ back into the original sequence correctly. Contrast this with the schema (1) above, where these three examples are coded as $2^3$, $2^3 \cdot 3$ and $2^3 \cdot 3 \cdot 5$ respectively. We see that the coding (1) above codes the length $n$ of the sequence $a_0, a_1, \ldots, a_{n-1}$ into the code $z = \langle a_0, a_1, \ldots, a_{n-1} \rangle$. This length is the number of primes in the decomposition of $z$ (of course, $\mathrm{Seq}(z)$ is true), and it is useful to have a function for it, called "$lh$". There are many ways to define a primitive recursive length function, $lh$, that does the job. The simplest definition allows $lh$ to give nonsensical answers for all inputs that do not code sequences. Examples of such inputs are 0, 1, 10;† in short, any number $z$ such that $\neg\mathrm{Seq}(z)$. We let

$$lh(z) = (\mu y)_{\leq z}(\neg\, p_y \mid z)$$

Clearly, $lh \in \mathcal{PR}$.

From all the above, we get that $\mathrm{Seq}(z)$ iff, for some $a_0, a_1, \ldots, a_{n-1}$, $z = \langle a_0, a_1, \ldots, a_{n-1} \rangle$ (this justifies the mnemonic "Seq" for "sequence").

Clearly, $lh(z) = n$ in this case, and‡ $\exp(i, z) \mathbin{\dot{-}} 1 = a_i$ for $i = 0, 1, \ldots, n - 1$ ($\exp(i, z) \mathbin{\dot{-}} 1 = 0$ if $i \geq n$).

It is customary to use the more compact symbol

$$(z)_i = \exp(i, z) \mathbin{\dot{-}} 1$$

Thus, if $\mathrm{Seq}(z)$, then the sequence $(z)_i$, for $i = 0, \ldots, lh(z) - 1$, decodes $z$.
We will also need to express sequence concatenation primitive recursively.

We define concatenation, "$*$", by

$$\langle a_0, \ldots, a_{n-1} \rangle * \langle b_0, \ldots, b_{m-1} \rangle \overset{\text{def}}{=} \langle a_0, \ldots, a_{n-1}, b_0, \ldots, b_{m-1} \rangle \qquad (2)$$

Of course, for $\lambda xy.x * y$ to be in $\mathcal{PR}$ we must have a total function to begin with, so that "$*$" must be defined on all natural numbers, not on just those satisfying Seq.

The following definition is at once seen to satisfy all our requirements:

$$x * y \overset{\text{def}}{=} x \cdot \prod_{i<lh(y)} p_{i+lh(x)}^{\exp(i, y)} \qquad (3)$$
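A small sketch makes the coding, decoding, and concatenation concrete. The names `prime`, `exp`, `lh`, `code`, `item`, and `concat` are illustrative, and `prime` uses plain trial division as a shortcut rather than the bounded-search definition of $p_n$.

```python
def prime(n):
    """p_n with p_0 = 2 (trial division; a convenience helper)."""
    count, cand = -1, 1
    while count < n:
        cand += 1
        if all(cand % d for d in range(2, int(cand ** 0.5) + 1)):
            count += 1
    return cand


def exp(n, x):
    """Least y <= x such that p_n^(y+1) does not divide x, else x + 1."""
    p, y = prime(n), 0
    while y <= x and x % p ** (y + 1) == 0:
        y += 1
    return y


def lh(z):
    """Least y <= z such that p_y does not divide z, else z + 1."""
    y = 0
    while y <= z and z % prime(y) == 0:
        y += 1
    return y


def code(*a):
    """<a_0, ..., a_{n-1}> as a product of prime powers p_y^(a_y + 1)."""
    z = 1
    for y, a_y in enumerate(a):
        z *= prime(y) ** (a_y + 1)
    return z


def item(z, i):
    """(z)_i = exp(i, z) minus 1, cut off at 0."""
    return max(exp(i, z) - 1, 0)


def concat(x, y):
    """x * y as in definition (3); total on all natural numbers."""
    z = x
    for i in range(lh(y)):
        z *= prime(i + lh(x)) ** exp(i, y)
    return z
```

For example, `code(2, 0, 0)` is $2^3 \cdot 3 \cdot 5$, and `lh` recovers the length 3 from it.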


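I.8.31 Theorem (Course-of-Values (Primitive) Recursion). Let $H(x, \vec{y}_n)$, the history function of $f$, stand for $\langle f(0, \vec{y}_n), \ldots, f(x, \vec{y}_n) \rangle$ for $x \geq 0$.
Then $\mathcal{PR}$ and $\mathcal{R}$ are closed under the following schema of course-of-values recursion:

$$f(0, \vec{y}_n) = h(\vec{y}_n)$$
$$f(x + 1, \vec{y}_n) = g(x, \vec{y}_n, H(x, \vec{y}_n))$$

Proof. It follows from the (ordinary) primitive recursion

$$H(0, \vec{y}_n) = \langle h(\vec{y}_n) \rangle$$
$$H(x + 1, \vec{y}_n) = H(x, \vec{y}_n) * \langle f(x + 1, \vec{y}_n) \rangle = H(x, \vec{y}_n) * \langle g(x, \vec{y}_n, H(x, \vec{y}_n)) \rangle$$

and $f(x, \vec{y}_n) = (H(x, \vec{y}_n))_x$.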
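The proof idea, recursing on the history $H$ and reading $f$ off its last entry, can be sketched as follows. As a simplifying assumption, a Python tuple stands in for the numeric code $\langle \ldots \rangle$, and the names `course_of_values` and `fib` are illustrative.

```python
def course_of_values(h, g):
    """Return f with f(0, y) = h(y) and f(x + 1, y) = g(x, y, H(x, y))."""
    def f(x, *y):
        history = (h(*y),)                           # H(0, y) = <h(y)>
        for t in range(x):                           # H(t + 1, y) = H(t, y) * <g(t, y, H(t, y))>
            history = history + (g(t, *y, history),)
        return history[x]                            # f(x, y) = (H(x, y))_x
    return f


# Fibonacci needs the two previous values, so ordinary primitive recursion
# on a single previous value does not apply directly.
fib = course_of_values(
    lambda: 0,
    lambda x, history: 1 if x == 0 else history[x] + history[x - 1],
)
```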


† Our definition gives $lh(0) = 1$, $lh(1) = 0$, $lh(10) = 1$.
‡ Since $\mathrm{Seq}(z)$, we have $\exp(i, z) \mathbin{\dot{-}} 1 = \exp(i, z) - 1$ for $i = 0, 1, \ldots, n - 1$.
We next arithmetize $\mathcal{P}$-functions and their computations. We will assign "program codes" to each function. A program code (called in the literature a Gödel number, or a $\phi$-index, or just an index) is, intuitively, a number in $\mathbb{N}$ that codes the "instructions" necessary to compute a $\mathcal{P}$-function.

If $i \in \mathbb{N}$ is a† code for $f \in \mathcal{P}$, then we write

    $f = \{i\}$    (Kleene's notation)

or‡

    $f = \phi_i$    (Rogers's (1967) notation)

Thus, either $\{i\}$ or $\phi_i$ denotes the function with code $i$.

The following table indicates how to assign Gödel numbers (middle column) to all partial recursive functions by following the definition of $\mathcal{P}$. In the table, $\hat{f}$ indicates a code of $f$:

Function                                      Code                                                            Comment
$\lambda x.0$                                 $\langle 0, 1, 0 \rangle$
$\lambda x.x + 1$                             $\langle 0, 1, 1 \rangle$
$\lambda\vec{x}_n.x_i$                        $\langle 0, n, i, 2 \rangle$                                    $1 \leq i \leq n$
Composition:                                  $\langle 1, m, \hat{f}, \hat{g}_1, \ldots, \hat{g}_n \rangle$   $f$ must be $n$-ary;
  $f(g_1(\vec{y}_m), \ldots, g_n(\vec{y}_m))$                                                                 $g_i$ must be $m$-ary
Primitive recursion from                      $\langle 2, n + 1, \hat{h}, \hat{g} \rangle$                    $h$ must be $n$-ary;
  basis $h$ and iterated part $g$                                                                             $g$ must be $(n + 2)$-ary
Unbounded search:                             $\langle 3, n, \hat{f} \rangle$                                 $f$ must be $(n + 1)$-ary
  $(\mu y) f(y, \vec{x}_n)$                                                                                   and $n > 0$


We have been somewhat loose in our description above. "The following table indicates how to assign Gödel numbers (middle column) to all partial recursive functions by following the definition of $\mathcal{P}$", we have said, perhaps leading the reader to think that we are defining the codes by recursion on $\mathcal{P}$. Not so. After all, each function has infinitely many codes.

What was really involved in the table (see also below) was arguing backwards: a specification of how we would like our $\phi$-indices to behave once we obtained them. We now turn to showing how to actually obtain them by directly defining the set of all $\phi$-indices, $\Phi$, as an inductively defined subset of $\{z : \mathrm{Seq}(z)\}$:

$$\Phi = \mathrm{Cl}(\mathscr{I}, \mathscr{R})$$

where $\mathscr{I} = \{\langle 0, 1, 0 \rangle, \langle 0, 1, 1 \rangle\} \cup \{\langle 0, n, i, 2 \rangle : n > 0 \land 1 \leq i \leq n\}$, and the rule set $\mathscr{R}$ consists of the following three operations:

(i) Coding composition: Input $a$ and $b_i$ ($i = 1, \ldots, n$) causes output
        $\langle 1, m, a, b_1, \ldots, b_n \rangle$
    provided $(a)_1 = n$ and $(b_i)_1 = m$, for $i = 1, \ldots, n$.
(ii) Coding primitive recursion: Input $a$ and $b$ causes output
        $\langle 2, n + 1, a, b \rangle$
    provided $(a)_1 = n$ and $(b)_1 = n + 2$.
(iii) Coding unbounded search: Input $a$ causes output
        $\langle 3, n, a \rangle$
    provided $(a)_1 = n + 1$ and $n > 0$.§

† The indefinite article is appropriate here. Just as in real life a computable function has infinitely many different programs that compute it, a partial recursive function $f$ has infinitely many different codes (see I.8.34 later on).
‡ That is where the name "$\phi$-index" comes from.

By the uniqueness of prime number decomposition, the pair $(\mathscr{I}, \mathscr{R})$ is unambiguous (see I.2.10, p. 24). Therefore we define by recursion on $\Phi$ (cf. I.2.13) a total function $\lambda a.\{a\}$ (or $\lambda a.\phi_a$)¶ for each $a \in \Phi$:

$$\{\langle 0, 1, 0 \rangle\} = \lambda x.0$$
$$\{\langle 0, 1, 1 \rangle\} = \lambda x.x + 1$$
$$\{\langle 0, n, i, 2 \rangle\} = \lambda\vec{x}_n.x_i$$
$$\{\langle 1, m, a, b_1, \ldots, b_n \rangle\} = \lambda\vec{y}_m.\{a\}(\{b_1\}(\vec{y}_m), \ldots, \{b_n\}(\vec{y}_m))$$
$$\{\langle 2, n + 1, a, b \rangle\} = \lambda x\vec{y}_n.\mathrm{Prec}(\{a\}, \{b\})$$
$$\{\langle 3, n, a \rangle\} = \lambda\vec{x}_n.(\mu y)\{a\}(y, \vec{x}_n)$$

In the above recursive definition we have used the abbreviation $\mathrm{Prec}(\{a\}, \{b\})$ for the function given (for all $x, \vec{y}_n$) by the primitive recursive schema (I.8.7) with $h$-part $\{a\}$ and $g$-part $\{b\}$.
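The recursive definition of $\{a\}$ reads like a small interpreter. The sketch below is illustrative only. It makes two assumptions that are not in the text: nested Python tuples stand in for the prime-power codes, and the primitive recursion schema I.8.7 (not reproduced in this section) is taken to be $f(0, \vec{y}) = h(\vec{y})$, $f(x + 1, \vec{y}) = g(x, \vec{y}, f(x, \vec{y}))$. The interpreter does not check membership in $\Phi$.

```python
from itertools import count


def run(code, *args):
    """Evaluate {code}(args); may loop forever on an unbounded search."""
    tag = code[0]
    if tag == 0:
        if len(code) == 3:                        # (0, 1, 0) zero, (0, 1, 1) successor
            return 0 if code[2] == 0 else args[0] + 1
        return args[code[2] - 1]                  # (0, n, i, 2): i-th argument, 1 <= i <= n
    if tag == 1:                                  # (1, m, a, b_1, ..., b_n): composition
        a, bs = code[2], code[3:]
        return run(a, *(run(b, *args) for b in bs))
    if tag == 2:                                  # (2, n + 1, a, b): primitive recursion
        a, b = code[2], code[3]
        x, ys = args[0], args[1:]
        w = run(a, *ys)                           # basis, the h-part
        for t in range(x):
            w = run(b, t, *ys, w)                 # iterated part, the g-part
        return w
    if tag == 3:                                  # (3, n, a): unbounded search
        a = code[2]
        for y in count():
            if run(a, y, *args) == 0:
                return y
    raise ValueError("not a phi-index")


ZERO, SUCC = (0, 1, 0), (0, 1, 1)


def proj(n, i):
    return (0, n, i, 2)


# Addition: add(0, y) = y and add(x + 1, y) = succ(add(x, y)).
ADD = (2, 2, proj(1, 1), (1, 3, SUCC, proj(3, 3)))
```

With these codes, `run(ADD, 3, 4)` unwinds the recursion three times starting from 4.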


We can now make the intentions implied in the above table official:

§ By an obvious I.H. the other cases can fend for themselves, but, here, reducing the number of arguments must not result in 0 arguments, as we have decided not to allow 0-ary functions.
¶ The input, $a$, is a code; the output, $\{a\}$ or $\phi_a$, is a function.

I.8.32 Theorem. $\mathcal{P} = \{\{a\} : a \in \Phi\}$.

Proof. $\subseteq$-part: Induction on $\mathcal{P}$. The table encapsulates the argument diagrammatically.

$\supseteq$-part: Induction on $\Phi$. It follows trivially from the recursive definition of $\{a\}$ and the fact that $\mathcal{P}$ contains the initial functions and is closed under composition, primitive recursion, and unbounded search.

I.8.33 Remark. (Important.) Thus, $f \in \mathcal{P}$ iff for some $a \in \Phi$, $f = \{a\}$.

I.8.34 Example. Every function $f \in \mathcal{P}$ has infinitely many $\phi$-indices. Indeed, let $f = \{\hat{f}\}$. Since $f = \lambda\vec{x}_n.u_1^1(f(\vec{x}_n))$, we obtain $f = \{\langle 1, n, \langle 0, 1, 1, 2 \rangle, \hat{f} \rangle\}$ as well. Since $\langle 1, n, \langle 0, 1, 1, 2 \rangle, \hat{f} \rangle > \hat{f}$, the claim follows.


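I.8.35 Theorem. The relation $x \in \Phi$ is primitive recursive.

Proof. Let $\chi$ denote the characteristic function of the relation "$x \in \Phi$". Then

$\chi(0) = 1$
$\chi(x + 1) = 0$ if
    $x + 1 = \langle 0, 1, 0 \rangle \lor x + 1 = \langle 0, 1, 1 \rangle \lor$
    $(\exists n, i)_{\leq x}(n > 0 \land 0 < i \leq n \land x + 1 = \langle 0, n, i, 2 \rangle) \lor$
    $(\exists a, b, m, n)_{\leq x}(\chi(a) = 0 \land (a)_1 = n \land \mathrm{Seq}(b) \land lh(b) = n \land (\forall i)_{<n}(\chi((b)_i) = 0 \land ((b)_i)_1 = m) \land x + 1 = \langle 1, m, a \rangle * b) \lor$
    $(\exists a, b, n)_{\leq x}(\chi(a) = 0 \land (a)_1 = n \land \chi(b) = 0 \land (b)_1 = n + 2 \land x + 1 = \langle 2, n + 1, a, b \rangle) \lor$
    $(\exists a, n)_{\leq x}(\chi(a) = 0 \land (a)_1 = n + 1 \land n > 0 \land x + 1 = \langle 3, n, a \rangle)$
$\chi(x + 1) = 1$ otherwise

The above can easily be seen to be a course-of-values recursion. For example, if $H(x) = \langle \chi(0), \ldots, \chi(x) \rangle$, then an occurrence of "$\chi(a) = 0$" above can be replaced by "$(H(x))_a = 0$", since $a \leq x$.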


We think of† a computation as a sequence of equations like $\{e\}(\vec{a}) = b$. Such an equation is intuitively read as "the program $e$, when it runs on input $\vec{a}$, produces output $b$". An equation will be legitimate iff

(i) it states an input-output relation of some initial function (i.e., $(e)_0 = 0$), or
(ii) it states an input-output relation according to $\phi$-indices $e$ such that $(e)_0 \in \{1, 2, 3\}$, using results (i.e., equations) that have already appeared in the sequence.

† "We think of" indicates our determination to avoid a rigorous definition. The integrity of our exposition will not suffer from this.

For example, in order to state $(\mu y)\{e\}(y, \vec{a}_n) = b$, that is, $\{\langle 3, n, e \rangle\}(\vec{a}_n) = b$, one must ensure that all the equations

$$\{e\}(b, \vec{a}_n) = 0,\ \{e\}(0, \vec{a}_n) = r_0,\ \ldots,\ \{e\}(b - 1, \vec{a}_n) = r_{b-1}$$

where the $r_i$'s are all non-zero, have already appeared in the sequence. In our coding, every equation $\{e\}(\vec{a}_n) = b$ will be denoted by a triple $\langle e, \vec{a}_n, b \rangle$ that codes, in that order, the $\phi$-index, the input, and the output. We will collect (code) all these triples into a single code, $u = \langle \ldots, \langle e, \vec{a}_n, b \rangle, \ldots \rangle$.

Before proceeding, let us define the primitive recursive predicates

(1) $\lambda uv.v \in u$ ("$v$ is a term in the (coded) sequence $u$"),
(2) $\lambda uvw.v <_u w$ ("$v$ occurs before $w$ in the (coded) sequence $u$").

Primitive recursiveness follows from the equivalences

$$v \in u \leftrightarrow \mathrm{Seq}(u) \land (\exists i)_{<lh(u)}\,(u)_i = v$$

$$v <_u w \leftrightarrow v \in u \land w \in u \land (\exists i, j)_{<lh(u)}((u)_i = v \land (u)_j = w \land i < j)$$


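We are now ready to define the relation "Computation($u$)" which holds iff $u$ codes a computation according to the previous understanding. This involves a lengthy formula. In the interest of readability, comments enclosed in { }-brackets are included in the left margin, to indicate the case under consideration.

$\mathrm{Computation}(u) \leftrightarrow \mathrm{Seq}(u) \land (\forall v)_{<u}[v \in u \to$

{$\lambda x.0$}
    $(\exists x)_{<v}\, v = \langle \langle 0, 1, 0 \rangle, x, 0 \rangle\ \lor$

{$\lambda x.x + 1$}
    $(\exists x)_{<v}\, v = \langle \langle 0, 1, 1 \rangle, x, x + 1 \rangle\ \lor$

{$\lambda\vec{x}_n.x_i$}
    $(\exists x, n, i)_{<v}\{\mathrm{Seq}(x) \land n = lh(x) \land i < n \land v = \langle \langle 0, n, i + 1, 2 \rangle \rangle * x * \langle (x)_i \rangle\}\ \lor$

{composition}
    $(\exists x, y, f, g, m, n, z)_{<u}\{\mathrm{Seq}(x) \land \mathrm{Seq}(y) \land \mathrm{Seq}(f) \land \mathrm{Seq}(g) \land n = lh(x) \land n = lh(g) \land m = lh(y) \land$
    $(f)_1 = n \land (\forall i)_{<n}(\mathrm{Seq}((g)_i) \land ((g)_i)_1 = m) \land$
    $v = \langle \langle 1, m, f \rangle * g \rangle * y * \langle z \rangle \land$
    $\langle f \rangle * x * \langle z \rangle <_u v \land$
    $(\forall i)_{<n}\,\langle (g)_i \rangle * y * \langle (x)_i \rangle <_u v\}\ \lor$

{prim. recursion}
    $(\exists x, y, h, g, n, c)_{<u}\{\mathrm{Seq}(h) \land (h)_1 = n \land \mathrm{Seq}(g) \land (g)_1 = n + 2 \land \mathrm{Seq}(y) \land lh(y) = n \land \mathrm{Seq}(c) \land lh(c) = x + 1 \land$
    $v = \langle \langle 2, n + 1, h, g \rangle, x \rangle * y * \langle (c)_x \rangle \land$
    $\langle h \rangle * y * \langle (c)_0 \rangle <_u v \land$
    $(\forall i)_{<x}\,\langle g, i \rangle * y * \langle (c)_i, (c)_{i+1} \rangle <_u v\}\ \lor$

{$(\mu y) f(y, \vec{x}_n)$}
    $(\exists f, y, x, n, r)_{<u}\{\mathrm{Seq}(f) \land (f)_1 = n + 1 \land n > 0 \land \mathrm{Seq}(x) \land lh(x) = n \land \mathrm{Seq}(r) \land lh(r) = y \land$
    $v = \langle \langle 3, n, f \rangle \rangle * x * \langle y \rangle \land$
    $\langle f, y \rangle * x * \langle 0 \rangle <_u v \land$
    $(\forall i)_{<y}(\langle f, i \rangle * x * \langle (r)_i \rangle <_u v \land (r)_i > 0)\}]$

Clearly, from the formula to the right of "$\leftrightarrow$" above we see that Computation($u$) is primitive recursive.

I.8.36 Definition (The Kleene T-Predicate). For each $n \in \mathbb{N}$, $T^{(n)}(a, \vec{x}_n, z)$ stands for $\mathrm{Computation}((z)_1) \land \langle a, \vec{x}_n, (z)_0 \rangle \in (z)_1$.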


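The above discussion yields immediately:

I.8.37 Theorem (Kleene Normal Form Theorem).

(1) $y = \{a\}(\vec{x}_n) \leftrightarrow (\exists z)(T^{(n)}(a, \vec{x}_n, z) \land (z)_0 = y)$.
(2) $\{a\}(\vec{x}_n) = ((\mu z)T^{(n)}(a, \vec{x}_n, z))_0$.
(3) $\{a\}(\vec{x}_n)\downarrow \leftrightarrow (\exists z)T^{(n)}(a, \vec{x}_n, z)$.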


I.8.38 Remark. (Very important.) The right hand side of I.8.37(2) above is meaningful for all $a \in \mathbb{N}$, while the left hand side is only meaningful for $a \in \Phi$.

We now extend the symbols $\{a\}$ and $\phi_a$ to be meaningful for all $a \in \mathbb{N}$. In all cases, the meaning is given by the right hand side of (2).

Of course, if $a \notin \Phi$, then $(\mu z)T^{(n)}(a, \vec{x}_n, z)\uparrow$ for all $\vec{x}_n$, since $T^{(n)}(a, \vec{x}_n, z)$ will be false under the circumstances. Hence also $\{a\}(\vec{x}_n)\uparrow$, as it should be intuitively: In computer programmer's jargon, "if the program $a$ is syntactically incorrect, then it will not run, for it will not even compile. Thus, it will define the everywhere undefined function".

By the above, I.8.33 now is strengthened to read "thus, $f \in \mathcal{P}$ iff for some $a \in \mathbb{N}$, $f = \{a\}$".

We can now define a $\mathcal{P}$-counterpart to $\mathcal{R}_*$ and $\mathcal{PR}_*$ and consider its closure properties.
I.8.39 Definition (Semi-recursive Relations or Predicates). A relation $P(\vec{x})$ is semi-recursive iff for some $f \in \mathcal{P}$, the equivalence

$$P(\vec{x}) \leftrightarrow f(\vec{x})\downarrow \qquad (1)$$

holds (for all $\vec{x}$, of course). Equivalently, we can state that $P = \mathrm{dom}(f)$.
The set of all semi-recursive relations is denoted by $\mathcal{P}_*$.†

If $f = \{a\}$ in (1) above, then we say that $a$ is a semi-recursive index of $P$.

If $P$ has one argument (i.e., $P \subseteq \mathbb{N}$) and $a$ is one of its semi-recursive indices, then we write $P = W_a$ (Rogers (1967)).

We have at once

I.8.40 Corollary (Normal Form Theorem for Semi-recursive Relations). $P(\vec{x}_n) \in \mathcal{P}_*$ iff for some $a \in \mathbb{N}$,

$$P(\vec{x}_n) \leftrightarrow (\exists z)T^{(n)}(a, \vec{x}_n, z)$$

Proof. By definition (and Theorem I.8.32 along with Remark I.8.38), $P(\vec{x}_n) \in \mathcal{P}_*$ iff, for some $a \in \mathbb{N}$, $P(\vec{x}_n) \leftrightarrow \{a\}(\vec{x}_n)\downarrow$. Now invoke I.8.37(3).


Rephrasing the above (hiding the "$a$", and remembering that $\mathcal{PR}_* \subseteq \mathcal{R}_*$) we have

I.8.41 Corollary (Strong Projection Theorem). $P(\vec{x}_n) \in \mathcal{P}_*$ iff for some $Q(\vec{x}_n, z) \in \mathcal{R}_*$,

$$P(\vec{x}_n) \leftrightarrow (\exists z)Q(\vec{x}_n, z)$$

Proof. For the only-if part take $Q(\vec{x}_n, z)$ to be $\lambda\vec{x}_n z.T^{(n)}(a, \vec{x}_n, z)$ for appropriate $a \in \mathbb{N}$. For the if part take $f = \lambda\vec{x}_n.(\mu z)Q(\vec{x}_n, z)$. Then $f \in \mathcal{P}$ and $P(\vec{x}_n) \leftrightarrow f(\vec{x}_n)\downarrow$.

Here is a characterization of $\mathcal{P}_*$ that is identical in form to the characterizations of $\mathcal{PR}_*$ and $\mathcal{R}_*$ (Lemma I.8.20).

I.8.42 Corollary. $P(\vec{x}_n) \in \mathcal{P}_*$ iff for some $f \in \mathcal{P}$,

$$P(\vec{x}_n) \leftrightarrow f(\vec{x}_n) = 0$$

† We are making this symbol up (it is not standard in the literature). We are motivated by comparing the contents of I.8.20 and of I.8.42 below.

Proof. Only-if part: Say $P(\vec{x}_n) \leftrightarrow g(\vec{x}_n)\downarrow$. Take $f = \lambda\vec{x}_n.0 \cdot g(\vec{x}_n)$.
If part: Let $f = \{a\}$. By I.8.37(1), $f(\vec{x}_n) = 0 \leftrightarrow (\exists z)(T^{(n)}(a, \vec{x}_n, z) \land (z)_0 = 0)$. We are done by strong projection.


The usual call-by-value semantics of $f(g(\vec{x}), \vec{y})$ require divergence if $g(\vec{x})\uparrow$. That is, before we embark on calculating the value of $f(g(\vec{x}), \vec{y})$, we require the values of all the inputs. In particular, $0 \cdot g(\vec{x}_n)\uparrow$ iff $g(\vec{x}_n)\uparrow$.

Of course, "$0 \cdot g(\vec{x}_n)$" is convenient notation. If we set $\{b\} = g$, we can write instead

$$\bigl((\mu z)(T^{(n)}(b, \vec{x}_n, (z)_1) \land (z)_0 = 0)\bigr)_0$$

We immediately obtain

I.8.43 Corollary. $\mathcal{R}_* \subseteq \mathcal{P}_*$.

Intuitively, for a predicate $R \in \mathcal{R}_*$ we have an algorithm (that computes $\chi_R$) that for any input $\vec{x}$ will halt and answer "yes" ($= 0$) or "no" ($= 1$) to the question "$\vec{x} \in R$?"

For a predicate $Q \in \mathcal{P}_*$ we are only guaranteed the existence of a weaker algorithm (for $f \in \mathcal{P}$ such that $\mathrm{dom}(f) = Q$). It will halt iff the answer to the question "$\vec{x} \in Q$?" is "yes" (and halting will amount to "yes"). If the answer is "no", it will never tell, because it will (as we say for non-halting) "loop for ever" (or diverge). Hence the name "semi-recursive" for such predicates.


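I.8.44 Theorem. $R \in \mathcal{R}_*$ iff both $R$ and $\neg R$ are in $\mathcal{P}_*$.

Proof. Only-if part. By I.8.43 and closure of $\mathcal{R}_*$ under $\neg$.
If part. Let $i$ and $j$ be semi-recursive indices of $R$ and $\neg R$ respectively, that is,

$$R(\vec{x}_n) \leftrightarrow (\exists z)T^{(n)}(i, \vec{x}_n, z)$$
$$\neg R(\vec{x}_n) \leftrightarrow (\exists z)T^{(n)}(j, \vec{x}_n, z)$$

Define

$$g = \lambda\vec{x}_n.(\mu z)(T^{(n)}(i, \vec{x}_n, z) \lor T^{(n)}(j, \vec{x}_n, z))$$

Trivially, $g \in \mathcal{P}$. Hence, $g \in \mathcal{R}$ since it is total (why?). We are done by noticing that $R(\vec{x}_n) \leftrightarrow T^{(n)}(i, \vec{x}_n, g(\vec{x}_n))$.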
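The "if" direction is an algorithm: search for the first witness on either side, then check which side it belongs to. In the sketch below the two witness predicates are hypothetical stand-ins for $T^{(n)}(i, \vec{x}_n, z)$ and $T^{(n)}(j, \vec{x}_n, z)$; the example relation ("$x$ is a perfect square") is chosen only so that the code can run.

```python
from itertools import count


def decide(witness_yes, witness_no, x):
    """Decide R(x), given decidable witness predicates for R and for not-R."""
    for z in count():                              # g(x) = least z witnessing R(x) or not-R(x)
        if witness_yes(x, z) or witness_no(x, z):
            return witness_yes(x, z)               # R(x) iff the "yes" predicate holds at g(x)


def square_yes(x, z):
    return z * z == x                              # z witnesses "x is a perfect square"


def square_no(x, z):
    return z * z < x < (z + 1) * (z + 1)           # z witnesses "x is not a perfect square"
```

The search terminates because, for every $x$, exactly one of $R(x)$ and $\neg R(x)$ holds and therefore has a witness, which is the point behind the "(why?)" about totality of $g$.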


I.8.45 Definition (Unsolvable Problems; Halting Problem). A problem is a question "$\vec{x} \in R$?" for any predicate $R$. "The problem $\vec{x} \in R$ is recursively unsolvable", or just "unsolvable", means that $R \notin \mathcal{R}_*$, that is, intuitively, there is no algorithmic solution to the problem.

The halting problem has central significance in recursion theory. It is the question whether program $x$ will ever halt if it starts computing on input $x$. That is, setting $K = \{x : \{x\}(x)\downarrow\}$ we can then ask "$x \in K$?". This question is the halting problem.† We denote the complement of $K$ by $\overline{K}$. We will refer to $K$ as the halting set.

I.8.46 Theorem (Unsolvability of the Halting Problem). The halting problem is unsolvable.

Proof. It suffices (by I.8.44) to show that $\overline{K}$ is not semi-recursive. Suppose instead that $i$ is a semi-recursive index of the set. Thus,

$$x \in \overline{K} \leftrightarrow (\exists z)T^{(1)}(i, x, z)$$

or, making the part $x \in \overline{K}$ (that is, $\{x\}(x)\uparrow$) explicit,

$$\neg(\exists z)T^{(1)}(x, x, z) \leftrightarrow (\exists z)T^{(1)}(i, x, z) \qquad (1)$$

Substituting $i$ into $x$ in (1), we get a contradiction.


I.8.47 Remark. Let us look at the above in the light of $W_x$-notation (p. 143). Now, $\{x\}(x)\uparrow$ iff $x \notin W_x$; thus we want to show

$$\neg(\exists i)(W_i = \{x : x \notin W_x\}) \qquad (2)$$

(2) says "$\{x : x \notin W_x\}$ is not a $W_i$". Well, if (2) is false, then, for some $i$,

$$x \in W_i \leftrightarrow x \notin W_x$$

and hence

$$i \in W_i \leftrightarrow i \notin W_i$$

a contradiction. This is a classic application of Cantor diagonalization‡ and is formally the same argument as in Russell's paradox, according to which $\{x : x \notin x\}$ is not a set; just omit the symbol "$W$" throughout.

The analogy is more than morphological: Our argument shows that $\{x : x \notin W_x\}$ is not an object of the same type as the rightmost object in the { } brackets. Russell's argument too shows that $\{x : x \notin x\}$ is not an object of the same type as the rightmost object in the { } brackets. That is, unlike $x$, it is not a set.

† "$K$" is a reasonably well-reserved symbol for the set $\{x : \{x\}(x)\downarrow\}$. Unfortunately, $K$ is also used for the first projection of a pairing function, but the context easily decides which is which.
‡ Cantor's theorem showed that if $(X_a)_{a \in I}$ is a family of sets, then $\{a : a \notin X_a\}$ is not an "$X_i$" (it is not in the family), for otherwise $i \in X_i$ iff $i \notin X_i$.

$K \in \mathcal{P}_*$, of course, since $\{x\}(x)\downarrow \leftrightarrow (\exists z)T^{(1)}(x, x, z)$. We conclude that the inclusion $\mathcal{R}_* \subseteq \mathcal{P}_*$ is proper, i.e., $\mathcal{R}_* \subset \mathcal{P}_*$.


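I.8.48 Theorem (Closure Properties of $\mathcal{P}_*$). $\mathcal{P}_*$ is closed under $\lor$, $\land$, $(\exists y)_{<z}$, $(\exists y)$, $(\forall y)_{<z}$. It is not closed under either $\neg$ or $(\forall y)$.

Proof. We will rely on the normal form theorem for semi-recursive relations and the strong projection theorem.

We are given semi-recursive relations $P(\vec{x}_n)$, $Q(\vec{y}_m)$, and $R(y, \vec{u}_k)$ of semi-recursive indices $p$, $q$, $r$ respectively.

$\lor$:

$$P(\vec{x}_n) \lor Q(\vec{y}_m) \leftrightarrow (\exists z)T^{(n)}(p, \vec{x}_n, z) \lor (\exists z)T^{(m)}(q, \vec{y}_m, z)$$
$$\leftrightarrow (\exists z)(T^{(n)}(p, \vec{x}_n, z) \lor T^{(m)}(q, \vec{y}_m, z))$$

$\land$:

$$P(\vec{x}_n) \land Q(\vec{y}_m) \leftrightarrow (\exists z)T^{(n)}(p, \vec{x}_n, z) \land (\exists z)T^{(m)}(q, \vec{y}_m, z)$$
$$\leftrightarrow (\exists w)((\exists z)_{<w}T^{(n)}(p, \vec{x}_n, z) \land (\exists z)_{<w}T^{(m)}(q, \vec{y}_m, z))$$

Breaking the pattern established by the proof for $\lor$, we may suggest a simpler proof for $\land$: $P(\vec{x}_n) \land Q(\vec{y}_m) \leftrightarrow ((\mu z)T^{(n)}(p, \vec{x}_n, z) + (\mu z)T^{(m)}(q, \vec{y}_m, z))\downarrow$.
Yet another proof, involving the decoding function $\lambda iz.(z)_i$, is

$$P(\vec{x}_n) \land Q(\vec{y}_m) \leftrightarrow (\exists z)T^{(n)}(p, \vec{x}_n, z) \land (\exists z)T^{(m)}(q, \vec{y}_m, z)$$
$$\leftrightarrow (\exists z)(T^{(n)}(p, \vec{x}_n, (z)_0) \land T^{(m)}(q, \vec{y}_m, (z)_1))$$

There is a technical reason (to manifest itself in II.4.6) that we want to avoid "complicated" functions like $\lambda iz.(z)_i$ in the proof.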


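$(\exists y)_{<z}$:

$$(\exists y)_{<z}R(y, \vec{u}_k) \leftrightarrow (\exists y)_{<z}(\exists w)T^{(k+1)}(r, y, \vec{u}_k, w)$$
$$\leftrightarrow (\exists w)(\exists y)_{<z}T^{(k+1)}(r, y, \vec{u}_k, w)$$

$(\exists y)$:

$$(\exists y)R(y, \vec{u}_k) \leftrightarrow (\exists y)(\exists w)T^{(k+1)}(r, y, \vec{u}_k, w)$$
$$\leftrightarrow (\exists z)(\exists y)_{<z}(\exists w)_{<z}T^{(k+1)}(r, y, \vec{u}_k, w)$$

Both of the $\exists$-cases can be handled by the decoding function $\lambda iz.(z)_i$. For example,

$$(\exists y)R(y, \vec{u}_k) \leftrightarrow (\exists y)(\exists w)T^{(k+1)}(r, y, \vec{u}_k, w)$$
$$\leftrightarrow (\exists z)T^{(k+1)}(r, (z)_0, \vec{u}_k, (z)_1)$$

$(\forall y)_{<z}$:

$$(\forall y)_{<z}R(y, \vec{u}_k) \leftrightarrow (\forall y)_{<z}(\exists w)T^{(k+1)}(r, y, \vec{u}_k, w)$$
$$\leftrightarrow (\exists v)(\forall y)_{<z}(\exists w)_{<v}T^{(k+1)}(r, y, \vec{u}_k, w)$$

Think of $v$ above as the successor ($+1$) of the maximum of some set of $w$-values, $w_0, \ldots, w_{z-1}$, that "work" for $y = 0, \ldots, z - 1$ respectively. The usual overkill proof of the above involves $(z)_i$ (or some such decoding scheme) as follows:

$$(\forall y)_{<z}R(y, \vec{u}_k) \leftrightarrow (\forall y)_{<z}(\exists w)T^{(k+1)}(r, y, \vec{u}_k, w)$$
$$\leftrightarrow (\exists w)(\forall y)_{<z}T^{(k+1)}(r, y, \vec{u}_k, (w)_y)$$

Regarding closure under $\neg$ and $\forall y$, $K$ provides a counterexample to $\neg$, and $\neg T^{(1)}(x, x, y)$ provides a counterexample to $\forall y$.

I.8.49 Remark (Projection Theorem). That $\mathcal{P}_*$ is closed under $(\exists y)$ is the content of the (weak) projection theorem.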


I.8.50 Definition (Recursively Enumerable Predicates). A predicate $R(\vec{x}_n)$ is recursively enumerable (r.e.) iff $R = \emptyset$ or, for some $f \in \mathcal{R}$ of one variable, $R = \{(\vec{x}_n) : (\exists m) f(m) = \langle \vec{x}_n \rangle\}$,† or, equivalently,

$$R(\vec{x}_n) \leftrightarrow (\exists m) f(m) = \langle \vec{x}_n \rangle \qquad (1)$$

By (1) and strong projection (I.8.41), every r.e. relation is semi-recursive. The converse is also true.

I.8.51 Theorem. Every semi-recursive $R$ is r.e.

Proof. Let $a$ be a semi-recursive index of $R$. If $R = \emptyset$, then we are done. Suppose then $R(\vec{a}_n)$ for some $\vec{a}_n$. We define a function $f$ by cases:

$$f(m) = \begin{cases} \langle (m)_0, \ldots, (m)_{n-1} \rangle & \text{if } T^{(n)}(a, (m)_0, \ldots, (m)_{n-1}, (m)_n) \\ \langle \vec{a}_n \rangle & \text{otherwise} \end{cases}$$

It is trivial that $f$ is recursive and satisfies (1) above. Indeed, our $f$ is in $\mathcal{PR}$.
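The enumerating function can be imitated for $n = 1$. In this sketch `witness` is a hypothetical stand-in for $T^{(1)}(a, x, w)$, `default` is the known member $a_1$, and, as a simplification, `f` returns the element $x$ itself rather than its code $\langle x \rangle$.

```python
def component(m, p):
    """(m)_0 for p = 2 and (m)_1 for p = 3: exponent of p in m, minus 1, cut off at 0."""
    e = 0
    while m > 0 and m % p == 0:
        m //= p
        e += 1
    return max(e - 1, 0)


def enumerator(witness, default):
    """Return f such that the range of f is {x : witness(x, w) holds for some w}."""
    def f(m):
        x, w = component(m, 2), component(m, 3)    # decode m as the pair ((m)_0, (m)_1)
        return x if witness(x, w) else default     # fall back on a known member
    return f


# Example: R = perfect squares, witnessed by a square root; 0 is a known member.
list_squares = enumerator(lambda x, w: w * w == x, 0)
```

Every member of $R$ is eventually output, since each pair $(x, w)$ is decoded from some $m$; outputs repeat freely, which the definition of r.e. permits.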


† Cf. comment regarding rare use of round brackets for $n$-tuples, p. 136.
Suppose that $i$ codes a program that acts on input variables $x$ and $y$ to compute a function $\lambda xy.f(x, y)$. It is certainly trivial to modify the program to compute $\lambda x.f(x, a)$ instead: In computer programming terms, we simply replace an instruction such as "read $y$" by one that says "$y := a$" (copy the value of $a$ into $y$). From the original code, a new code (depending on $i$ and $a$) ought to be trivially calculable.

This is the essence of Kleene's iteration or S-m-n theorem below.

I.8.52 Theorem (Kleene's Iteration or S-m-n Theorem). There is a primitive recursive function $\lambda xy.\sigma(x, y)$ such that for all $i$, $x$, $y$,

$$\{i\}(\langle x, y \rangle) = \{\sigma(i, y)\}(x)$$

Proof. Let $a$ be a $\phi$-index of $\lambda x.\langle x, 0 \rangle$, and $b$ a $\phi$-index of $\lambda x.3x$. Next we find a primitive recursive $\lambda y.h(y)$ such that for all $x$ and $y$

$$\{h(y)\}(x) = \langle x, y \rangle \qquad (*)$$

To achieve this observe that

$$\langle x, 0 \rangle = \{a\}(x)$$

and

$$\langle x, y + 1 \rangle = 3\langle x, y \rangle = \{b\}(\langle x, y \rangle)$$

Thus, it suffices to take

$$h(0) = a$$
$$h(y + 1) = \langle 1, 1, b, h(y) \rangle$$

Now that we have an $h$ satisfying $(*)$, we note that

$$\sigma(i, y) = \langle 1, 1, i, h(y) \rangle$$

will do.
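The construction of $h$ and $\sigma$ is purely syntactic: programs are composed, never run. The toy model below illustrates this under loud assumptions: a "program" is either a wrapped Python function (standing in for an index of a given function) or a tagged tuple `("comp", outer, inner)` standing in for the code $\langle 1, 1, \text{outer}, \text{inner} \rangle$. None of these names comes from the text.

```python
def pair(x, y):
    """<x, y> = 2^(x+1) * 3^(y+1)."""
    return 2 ** (x + 1) * 3 ** (y + 1)


A = ("prim", lambda x: pair(x, 0))       # stands for an index a of  x -> <x, 0>
B = ("prim", lambda x: 3 * x)            # stands for an index b of  x -> 3x


def run(prog, x):
    """Evaluate a toy unary program on input x."""
    if prog[0] == "prim":
        return prog[1](x)
    _, outer, inner = prog               # ("comp", outer, inner): outer after inner
    return run(outer, run(inner, x))


def h(y):
    """Program computing x -> <x, y>, built without running anything."""
    prog = A                             # h(0) = a
    for _ in range(y):
        prog = ("comp", B, prog)         # h(y + 1) composes b with h(y)
    return prog


def sigma(i, y):
    """Program computing x -> {i}(<x, y>)."""
    return ("comp", i, h(y))
```

For any toy program `i`, running `sigma(i, y)` on `x` gives the same result as running `i` on `pair(x, y)`.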
I.8.53 Corollary. There is a primitive recursive function $\lambda iy.k(i, y)$ such that, for all $i$, $x$, $y$,

$$\{i\}(x, y) = \{k(i, y)\}(x)$$

Proof. Let $a_0$ and $a_1$ be $\phi$-indices of $\lambda z.(z)_0$ and $\lambda z.(z)_1$ respectively. Then $\{i\}((z)_0, (z)_1) = \{\langle 1, 1, i, a_0, a_1 \rangle\}(z)$ for all $z$, $i$. Take $k(i, y) = \sigma(\langle 1, 1, i, a_0, a_1 \rangle, y)$.
I.8.54 Corollary. There is for each $m > 0$ and $n > 0$ a primitive recursive function $\lambda i\vec{y}_n.S_n^m(i, \vec{y}_n)$ such that, for all $i$, $\vec{x}_m$, $\vec{y}_n$,

$$\{i\}(\vec{x}_m, \vec{y}_n) = \{S_n^m(i, \vec{y}_n)\}(\vec{x}_m)$$

Proof. Let $a_r$ ($r = 0, \ldots, m - 1$) and $b_r$ ($r = 0, \ldots, n - 1$) be $\phi$-indices so that $\{a_r\} = \lambda xy.(x)_r$ ($r = 0, \ldots, m - 1$) and $\{b_r\} = \lambda xy.(y)_r$ ($r = 0, \ldots, n - 1$).

Set $c(i) = \langle 1, 2, i, a_0, \ldots, a_{m-1}, b_0, \ldots, b_{n-1} \rangle$, for all $i \in \mathbb{N}$, and let $d$ be a $\phi$-index of $\lambda\vec{x}_m.\langle \vec{x}_m \rangle$. Then,

$$\{i\}(\vec{x}_m, \vec{y}_n) = \{c(i)\}(\langle \vec{x}_m \rangle, \langle \vec{y}_n \rangle)$$
$$= \{k(c(i), \langle \vec{y}_n \rangle)\}(\langle \vec{x}_m \rangle) \quad \text{by I.8.53}$$
$$= \{\langle 1, m, k(c(i), \langle \vec{y}_n \rangle), d \rangle\}(\vec{x}_m)$$

Take $\lambda i\vec{y}_n.S_n^m = \langle 1, m, k(c(i), \langle \vec{y}_n \rangle), d \rangle$.

Since $\mathcal{P}$-functions are closed under permutation of variables, there is no significance (other than notational convenience, and a random left vs. right choice) in presenting the S-m-n theorem in terms of a "neat" left-right partition of the variable list. Any variable sub-list can be parametrized.


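I.8.55 Corollary (Kleene's Recursion Theorem). If $\lambda z\vec{x}_n.f(z, \vec{x}_n) \in \mathcal{P}$, then for some $e$,

$$\{e\}(\vec{x}_n) = f(e, \vec{x}_n) \quad \text{for all } \vec{x}_n$$

Proof. Let $\{a\} = \lambda z\vec{x}_n.f(S_1^n(z, z), \vec{x}_n)$. Then

$$f(S_1^n(a, a), \vec{x}_n) = \{a\}(a, \vec{x}_n)$$
$$= \{S_1^n(a, a)\}(\vec{x}_n) \quad \text{by I.8.54}$$

Take $e = S_1^n(a, a)$.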


I.8.56 Definition. A complete index set is a set $A = \{x : \{x\} \in \mathcal{Q}\}$ for some $\mathcal{Q} \subseteq \mathcal{P}$.

$A$ is trivial iff $A = \emptyset$ or $A = \mathbb{N}$ (correspondingly, $\mathcal{Q} = \emptyset$ or $\mathcal{Q} = \mathcal{P}$). Otherwise it is non-trivial.

I.8.57 Theorem (Rice). A complete index set is recursive iff it is trivial.

Thus, algorithmically we can only decide trivial properties of programs.

Proof. (The idea of this proof is attributed in Rogers (1967) to G. C. Wolpin.)

If part: Immediate, since $\chi_\emptyset = \lambda x.1$ and $\chi_{\mathbb{N}} = \lambda x.0$.
Only-if part: By contradiction, suppose that $A = \{x : \{x\} \in \mathcal{Q}\}$ is non-trivial, yet $A \in \mathcal{R}_*$. So let $a \in A$ and $b \notin A$. Define $f$ by

$$f(x) = \begin{cases} b & \text{if } x \in A \\ a & \text{if } x \notin A \end{cases}$$

Clearly,

$$x \in A \text{ iff } f(x) \notin A, \text{ for all } x \qquad (1)$$

By the recursion theorem, there is an $e$ such that $\{f(e)\} = \{e\}$ (apply I.8.55 to $\lambda xy.\{f(x)\}(y)$).
Thus, $e \in A$ iff $f(e) \in A$, contradicting (1).

A few more applications of the recursion theorem will be found in the Exercises.

I.8.58 Example. Every function of $\mathcal{P}$ has infinitely many indices (revisited). For suppose not, and let $A = \{x : \{x\} \in \mathcal{Q}\}$ be finite, where $\mathcal{Q} = \{f\}$. Then $A$ is recursive (why?), contradicting Rice's theorem.


We have seen that progressing along $\mathcal{PR}_*$, $\mathcal{R}_*$, $\mathcal{P}_*$ we obtain strictly more inclusive sets of relations, or, intuitively, progressively more "complex" predicates. For example, we can easily "solve" $\lambda xy.x < y$, we can only "half" solve $x \in K$, and we cannot even do that for $x \notin K$. The latter is beyond $\mathcal{P}_*$. Still, all three are "arithmetic"† or "arithmetical" predicates in a sense that we make precise below. The interest immediately arises to classify arithmetic predicates according to increasing "complexity". This leads to the arithmetic(al) hierarchy of Kleene and Mostowski.

I.8.59 Definition. The set of all arithmetic(al)‡ predicates is the least set that includes $\mathcal{R}_*$ and is closed under $(\exists x)$ and $(\forall x)$. We will denote this set by $\Delta$.

† Accent on "met".
‡ We will adhere to the term "arithmetic", as in Smullyan (1992). The reader will note that these predicates are introduced in two different ways in Smullyan (1992), each different from the above. One is indicated by a capital "A" and the other by a lowercase "a". All three definitions are equivalent. We follow the standard definition given in works in recursion theory (Rogers (1967), Hinman (1978), Tourlakis (1984)).
We sort arithmetic relations into a hierarchy as follows (Kleene (1943), Mostowski (1947)).

I.8.60 Definition (The Arithmetic Hierarchy). We define by induction on $n \in \mathbb{N}$:

$$\Sigma_0 = \Pi_0 = \mathcal{R}_*$$
$$\Sigma_{n+1} = \{(\exists x)P : P \in \Pi_n\}$$
$$\Pi_{n+1} = \{(\forall x)P : P \in \Sigma_n\}$$

The variable "$x$" above is generic.

We also define, for all $n$, $\Delta_n = \Sigma_n \cap \Pi_n$, and also set $\Delta_\infty = \bigcup_{n \geq 0}(\Sigma_n \cup \Pi_n)$.


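I.8.61 Remark. Intuitively, the arithmetic hierarchy is composed of all relations of the form $(Q_1 x_1)(Q_2 x_2) \ldots (Q_n x_n)R$, where $R \in \mathcal{R}_*$ and $Q_i \in \{\exists, \forall\}$ for $i = 1, \ldots, n$. If $n = 0$ there is no quantifier prefix. Since $\exists x\exists y$ and $\forall x\forall y$ can be "collapsed" into a single $\exists$ and single $\forall$ respectively,

Pause. Do you believe this?

one can think of the prefix as a sequence of alternating quantifiers. The relation is placed in a $\Pi$-set (respectively, $\Sigma$-set) iff the leftmost quantifier is a "$\forall$" (respectively, "$\exists$").

I.8.62 Lemma. $R \in \Sigma_n$ iff $(\neg R) \in \Pi_n$. $R \in \Pi_n$ iff $(\neg R) \in \Sigma_n$.

Proof. We handle both equivalences simultaneously. For $n = 0$ this is so by closure properties of $\mathcal{R}_*$.
Assuming the claim for $n$, we have

    $R \in \Sigma_{n+1}$ iff $R \leftrightarrow (\exists y)Q$ and $Q \in \Pi_n$
                         iff (I.H.) $R \leftrightarrow \neg(\forall y)\neg Q$ and $(\neg Q) \in \Sigma_n$
                         iff $\neg R \leftrightarrow (\forall y)\neg Q$ and $(\neg Q) \in \Sigma_n$
                         iff $\neg R \in \Pi_{n+1}$

and

    $R \in \Pi_{n+1}$ iff $R \leftrightarrow (\forall y)Q$ and $Q \in \Sigma_n$
                      iff (I.H.) $R \leftrightarrow \neg(\exists y)\neg Q$ and $(\neg Q) \in \Pi_n$
                      iff $\neg R \leftrightarrow (\exists y)\neg Q$ and $(\neg Q) \in \Pi_n$
                      iff $\neg R \in \Sigma_{n+1}$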
It is trivial that $\Delta_\infty \subseteq \Delta$. The following easy lemma yields the converse inclusion as a corollary.

I.8.63 Lemma. We have the following closure properties:

$\Delta_\infty$ is closed under (1)-(6) from the following list.
$\Delta_n$ ($n \geq 0$) is closed under (1)-(4).
$\Sigma_n$ ($n \geq 0$) is closed under (1)-(3), and, if $n > 0$, also (5).
$\Pi_n$ ($n \geq 0$) is closed under (1)-(3), and, if $n > 0$, also (6).

(1) Replacement of a variable by a recursive function†
(2) $\lor$, $\land$
(3) $(\exists y)_{<z}$, $(\forall y)_{<z}$
(4) $\neg$
(5) $(\exists y)$
(6) $(\forall y)$

Proof. (1) follows from the corresponding closure of $\mathcal{R}_*$. The rest follow at once from I.8.60 and the techniques of I.8.48 ((4) also uses I.8.62).

I.8.64 Corollary. $\Delta = \Delta_\infty$. Hence, Definition I.8.60 classifies all arithmetic predicates according to definitional complexity.


We next see that the complexity of arithmetic predicates increases in more than just "form" as $n$ increases (i.e., as the form "$(Q_1 x_1)(Q_2 x_2) \ldots (Q_n x_n)R$" gets more complex with a longer alternating quantifier prefix, we do get new predicates defined).

I.8.65 Proposition. $\Sigma_n \cup \Pi_n \subseteq \Delta_{n+1}$ for $n \geq 0$.

Proof. Induction on $n$.
Basis. For $n = 0$, $\Sigma_n \cup \Pi_n = \mathcal{R}_*$. On the other hand, $\Sigma_1 = \mathcal{P}_*$ (why?), while (by I.8.62) $\Pi_1 = \{Q : (\neg Q) \in \mathcal{P}_*\}$. Thus $\Delta_1 = \Sigma_1 \cap \Pi_1 = \mathcal{R}_*$ by I.8.44.
We consider now the $n + 1$ case (under the obvious I.H.). Let $R(\vec{x}_r) \in \Sigma_{n+1}$. Then, by I.8.63(1), so is $\lambda z\vec{x}_r.R(\vec{x}_r)$, where $z$ is not among the $\vec{x}_r$. Now, $R \leftrightarrow (\forall z)R$, but $(\forall z)R \in \Pi_{n+2}$.

† Since $u_i^n$, $\lambda x.0$, and $\lambda x.x + 1$ are recursive, this allows the full range of the Grzegorczyk substitutions (Exercise I.68), i.e., additionally to function substitution, also expansion of the variable list, permutation of the variable list, identification of variables, and substitution of constants into variables.

Next, let $R(\vec{x}_r) \leftrightarrow (\exists z)Q(z, \vec{x}_r)$, where $Q \in \Pi_n$. By the I.H., $Q \in \Delta_{n+1}$; hence $Q \in \Pi_{n+1}$. Thus, $R \in \Sigma_{n+2}$.

The argument is similar if we start the previous sentence with "Let $R(\vec{x}_r) \in \Pi_{n+1}$".

I.8.66 Corollary. $\Sigma_n \subseteq \Sigma_{n+1}$ and $\Pi_n \subseteq \Pi_{n+1}$, for $n \geq 0$.

I.8.67 Corollary. $\Delta_n \subseteq \Delta_{n+1}$ for $n \geq 0$.


We next sharpen the inclusions above to proper inclusions.

I.8.68 Definition (Kleene). For $n \geq 1$ we define $\lambda xy.E_n(x, y)$ by induction:

$$E_1(x, y) \leftrightarrow (\exists z)T^{(1)}(x, y, z)$$
$$E_{n+1}(x, y) \leftrightarrow (\exists z)\neg E_n(x, \langle z \rangle * y)$$

where "$\langle \ldots \rangle$" is our standard coding of p. 136.

Of course, $\langle z \rangle = 2^{z+1}$.

I.8.69 Lemma. $E_n \in \Sigma_n$ and $\neg E_n \in \Pi_n$, for $n \geq 1$.

Proof. A trivial induction (via I.8.62 and I.8.63).


1.8.70 Theorem (Enumeration or Indexing Theorem (Kleene)). 


(1) R(x,) € Lat iff R(x;) Oo EnauiG, (x,)) for some 1. 
(2) R(x;) € Wn41 iff R(x;) o Ens, (x) for some 1. 


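Proof. The if part is Lemma I.8.69 (with the help of I.8.63(1)). We prove the
only-if part ((1) and (2) simultaneously) by induction on $n$.

Basis. $n = 0$. If $R(\vec{x}_r) \in \Sigma_1 = \mathcal{P}_*$, then so is $R((u)_0, \ldots, (u)_{r-1})$. Thus, for
some $i$ (semi-recursive index), $R((u)_0, \ldots, (u)_{r-1}) \leftrightarrow (\exists z)T(i, u, z)$; hence
$R(\vec{x}_r) \leftrightarrow (\exists z)T(i, \langle \vec{x}_r \rangle, z) \leftrightarrow E_1(i, \langle \vec{x}_r \rangle)$.

If $R(\vec{x}_r) \in \Pi_1$, then $(\neg R(\vec{x}_r)) \in \Sigma_1 = \mathcal{P}_*$. Thus, for some $e$,
$\neg R(\vec{x}_r) \leftrightarrow E_1(e, \langle \vec{x}_r \rangle)$; hence $R(\vec{x}_r) \leftrightarrow \neg E_1(e, \langle \vec{x}_r \rangle)$.


The induction step. (Based on the obvious I.H.) Let $R(\vec{x}_r) \in \Sigma_{n+2}$. Then
$R(\vec{x}_r) \leftrightarrow (\exists z)Q(z, \vec{x}_r)$, where $Q \in \Pi_{n+1}$, hence (I.H.)


$$Q(z, \vec{x}_r) \leftrightarrow \neg E_{n+1}(e, \langle z, \vec{x}_r \rangle) \quad \text{for some } e \qquad (i)$$

Since $\langle z, \vec{x}_r \rangle = \langle z \rangle * \langle \vec{x}_r \rangle$, (i) yields


$$R(\vec{x}_r) \leftrightarrow (\exists z)\neg E_{n+1}(e, \langle z \rangle * \langle \vec{x}_r \rangle)$$
$$\leftrightarrow E_{n+2}(e, \langle \vec{x}_r \rangle) \quad \text{(by the definition of } E_{n+2}\text{)}$$


An entirely analogous argument takes care of “Let $R(\vec{x}_r) \in \Pi_{n+2}$”.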


I.8.71 Corollary. The same set of arithmetic relations, $\Delta$, can be defined by
setting $\Sigma_0 = \Pi_0 = \mathcal{PR}_*$. Indeed, no sets in the hierarchy are affected by this
change, except $\Sigma_0 = \Pi_0$.


I.8.72 Theorem (Hierarchy Theorem (Kleene, Mostowski)).
(1) $\Sigma_{n+1} - \Pi_{n+1} \ne \emptyset$,
(2) $\Pi_{n+1} - \Sigma_{n+1} \ne \emptyset$.


Moreover, all the inclusions in I.8.65 ($n > 0$), I.8.66 ($n \ge 0$), and I.8.67 ($n > 0$)
are proper.


Proof. (1): $E_{n+1} \in \Sigma_{n+1} - \Pi_{n+1}$. Indeed (see also I.8.69), if $E_{n+1} \in \Pi_{n+1}$,
then


$$E_{n+1}(x, \langle x \rangle) \leftrightarrow \neg E_{n+1}(i, \langle x \rangle) \qquad (1')$$


for some $i$ (I.8.63(1) and I.8.70). Letting $x = i$ in (1′), we get a contradiction.
(2): As in (1), but use $\neg E_{n+1}$ as the counterexample.


For I.8.65: Let $R(x, y) \leftrightarrow E_n(x, \langle x \rangle) \vee \neg E_n(y, \langle y \rangle)$. Since $E_n \in \Sigma_n$ and
$(\neg E_n) \in \Pi_n$, I.8.63(1), I.8.65, and closure of $\Delta_{n+1}$ under $\vee$ yield $R \in \Delta_{n+1}$.
Let $x_0$ and $y_0$ be such that $E_n(x_0, \langle x_0 \rangle)$ is false and $E_n(y_0, \langle y_0 \rangle)$ is true (if no
such $x_0$, $y_0$ exist, then $E_n$ is recursive (why?); not so for $n > 0$ (why?)).

Now, $R \notin \Sigma_n \cup \Pi_n$, for, otherwise, say it is in $\Sigma_n$. Then so is $R(x_0, y)$ (that
is, $\neg E_n(y, \langle y \rangle)$), which is absurd. Similarly, if $R \in \Pi_n$, we are led to the absurd
conclusion that $R(x, y_0) \in \Pi_n$.


For I.8.66: $K \in \Sigma_1 - \Sigma_0$. For $n > 0$, $\Sigma_n = \Sigma_{n+1}$ implies $\Pi_n = \Pi_{n+1}$
(why?); hence $\Sigma_n \cup \Pi_n = \Sigma_{n+1} \cup \Pi_{n+1} \supseteq \Sigma_{n+1} \cap \Pi_{n+1} = \Delta_{n+1}$; thus
$\Sigma_n \cup \Pi_n = \Delta_{n+1}$ by I.8.65, contrary to what we have just established above.
Similarly for the dual $\Pi_n$ vs. $\Pi_{n+1}$.

For I.8.67: If $\Delta_{n+1} = \Delta_n$, then $\Delta_{n+1} = \Sigma_n \cap \Pi_n \subseteq \Sigma_n \cup \Pi_n$, which has
been established as an absurdity for $n > 0$.

In this section we apply recursion-theoretic techniques to the proof theory
of a certain axiomatic arithmetic in order to derive a major result (Gödel’s
first incompleteness theorem) regarding the inadequacy of the syntactic proof
apparatus.

We have to overcome a small obstacle outright: Our recursion theory is 
a theory of number-theoretic functions and relations. We need some way to 
translate its results in the realm of strings (over some alphabet) so that our theory 
can handle recursive, r.e., primitive recursive, etc., functions and predicates 
that have inputs and outputs that are such strings. For example, a recursively 
axiomatized theory, we say, is one with a recursive set of axioms. But what do 
we mean by a recursive set of strings? 

Well, we can code strings by numbers, and then use the numbers as proxies
for the strings. This is the essence of Gödel numbering, invented by Gödel
(1931) towards that end.


Let $\mathcal{V}$ be a finite alphabet. We denote by $\mathcal{V}^*$ the set of all strings over $\mathcal{V}$.
We use the term “Gödel-numbering” for any 1-1 (not necessarily onto) function
$f : \mathcal{V}^* \to \mathbb{N}$ that is intuitively computable. More precisely, we want to have
two algorithms: one to compute $f(w)$ for each $w \in \mathcal{V}^*$, and one to check if a
given $n \in \mathbb{N}$ codes some string over $\mathcal{V}$ (i.e., is in the range of $f$), and if so, to
compute $w = f^{-1}(n)$.

We use special notation in the context of Gödel numberings. Rather than
“$f(w)$” (the Gödel number of $w$) we write $\ulcorner w \urcorner$, allowing the context to tell
which specific $f$ we have in mind.

The above, of course, is not a precise definition, since the term “computable”
(function) was never defined between “mixed” fields ($\mathcal{V}^*$ on the left and $\mathbb{N}$ on
the right). This does not present a problem however, for in practice any specific
Gödel numbering will be trivially seen (at the intuitive level) to satisfy the
algorithmic coding-decoding requirement.


The heart of the matter in effecting a Gödel numbering of a given $\mathcal{V}^*$ is
simple: We start with a “primitive” numbering by assigning distinct numbers
from $\mathbb{N}$ to each symbol of $\mathcal{V}$. Then, in principle, we can extend this primitive
numbering to the entire $\mathcal{V}^*$ by recursion on strings.


However, in the cases of interest to us, that is, terms, formulas, and sequences
of such over some language $L$, we will prefer to do our recursion on terms,
formulas, and sequences (rather than on generic strings).

For example, if the numbers $n_0, n_1, \ldots, n_m$ were already assigned to the
formal function symbol $g$ ($m$-ary) and to the terms $t_1, \ldots, t_m$ respectively, one
would just need a way to obtain a number for $g t_1 \ldots t_m$ from the numbers
we have just listed. This simply requires the presence of some number-coding
scheme, “$\langle\ \rangle$”, to compute $\langle n_0, n_1, \ldots, n_m \rangle$. Gödel used, essentially, the prime
power coding scheme “$\langle\ \rangle$, $lh$, $Seq$, $(z)_i$” of p. 136.


We will not adopt prime power coding here, but instead we will use a different
coding based on Gödel’s $\beta$-function. Our approach is based on the exposition in
Shoenfield (1967), the motivation being to do the number coding with as little
number theory as possible,† so that when we have to revisit all this in the next
chapter (formally this time) our task will be reasonably easy, and short.


We fix, for the balance of the chapter, a simple pairing function, that is,
a total and 1-1 function $J : \mathbb{N}^2 \to \mathbb{N}$ that can be obtained via addition and
multiplication. For example, let us set


$$J(x, y) = (x + y)^2 + x \quad \text{for all } x, y \qquad (*)$$


$J$ is trivially primitive recursive. Its 1-1-ness follows from the observation
that $x + y < u + v$ implies $x + y + 1 \le u + v$ and hence
$J(x, y) < (x + y + 1)^2 \le (u + v)^2 \le J(u, v)$.

Thus, if $J(x, y) = J(u, v)$, then it must be $x + y = u + v$, and therefore
$x = u$. But then $y = v$.

The symbols $K$ and $L$ are reserved for the projection functions of a pairing
function $J$.‡

We see at once that $K$, $L$ are in $\mathcal{PR}$; indeed,


$$K(z) = (\mu x)_{\le z}(\exists y)_{\le z}(J(x, y) = z)$$
$$L(z) = (\mu y)_{\le z}(\exists x)_{\le z}(J(x, y) = z)$$
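
The pairing function $J(x, y) = (x + y)^2 + x$ and the bounded searches that define its projections can be tried out directly. The following is a minimal Python sketch, not part of the text. It assumes that a failed bounded search over the range $0, \ldots, z$ returns $z + 1$, and the helper `unpair` is a hypothetical closed-form shortcut that relies on this particular $J$.

```python
from math import isqrt


def J(x, y):
    """Pairing function J(x, y) = (x + y)^2 + x."""
    return (x + y) ** 2 + x


def K(z):
    """Least x <= z such that J(x, y) = z for some y <= z.

    Assumption: a failed bounded search returns z + 1.
    """
    for x in range(z + 1):
        if any(J(x, y) == z for y in range(z + 1)):
            return x
    return z + 1


def L(z):
    """Least y <= z such that J(x, y) = z for some x <= z (z + 1 on failure)."""
    for y in range(z + 1):
        if any(J(x, y) == z for x in range(z + 1)):
            return y
    return z + 1


def unpair(z):
    """Hypothetical shortcut: invert J directly, or return None if z is not a J-value.

    If z = (x + y)^2 + x, then s = x + y satisfies s^2 <= z < (s + 1)^2.
    """
    s = isqrt(z)
    x = z - s * s
    return (x, s - x) if x <= s else None
```

The brute-force `K` and `L` mirror the bounded-search definitions literally and are only practical for small arguments; `unpair` gives the same answers on J-values in constant time.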


Let us then address the task of finding a way to “algorithmically” code a number
sequence $a_0, \ldots, a_n$ by a single number, so that each of the $a_i$ can be
algorithmically recovered from the code as long as we know its position $i$.

We start by setting


$$c = \max\{1 + J(i, a_i) : i \le n\} \qquad (**)$$


† Gödel’s original approach needed enough number theory to include the prime factorization
theorem. An alternative approach due to Quine, utilized in Smullyan (1961, 1992), requires the
theorem that every number can be uniquely written as a string of base-$b$ digits out of some fixed
alphabet $\{0, 1, \ldots, b - 1\}$. On the other hand, employing the variant of Gödel’s $\beta$-function found
in Shoenfield (1967), one does not even need the Chinese remainder theorem.

‡ It is unfortunate, but a standard convention, that this “K” clashes with the usage of “K” as the
name of the halting set (I.8.45). Fortunately, it is very unlikely that the context will allow the
confusion of the two notations.

Next, let $p$ be the least common multiple of the numbers $1, 2, \ldots, c + 1$, for
short, $\mathrm{lcm}\{1, 2, \ldots, c + 1\}$, that is,


$$p = (\mu z)(z > 0 \wedge (\forall i)_{\le c}\,(i + 1) \mid z) \qquad (***)$$
We recall the following definition from elementary number theory:

I.9.1 Definition. The greatest common divisor, or gcd, of two natural numbers
$a$ and $b$ such that $a + b \ne 0$ is the largest $d$ such that $d \mid a$ and $d \mid b$.† We write
$d = \gcd(a, b) = \gcd(b, a)$.


I.9.2 Lemma. For each $a$ and $b$ in $\mathbb{N}$ ($a + b \ne 0$) there are integers $u$ and $v$
such that $\gcd(a, b) = au + bv$.


Proof. The set $A = \{ax + by > 0 : x \in \mathbb{Z} \wedge y \in \mathbb{Z}\}$ is nonempty. For example,
if $a \ne 0$, then $a \cdot 1 + b \cdot 0 > 0$ is in $A$. Let $g$ be the least member of $A$. So, for some
$u$ and $v$ in $\mathbb{Z}$,


$$g = au + bv > 0 \qquad (1)$$

We argue that $g \mid a$ (and similarly, $g \mid b$). If not, let

$$0 < r < g \qquad (2)$$

such that $a = gq + r$ ($q = \lfloor a/g \rfloor$). Thus, using (1), $r = a - gq = a(1 - uq) - bvq$,
a member of $A$. (2) contradicts the minimality of $g$.

Thus, $g$ is a common divisor of $a$ and $b$. On the other hand, by (1), every
common divisor $d$ of $a$ and $b$ divides $g$; hence $d \le g$. Thus, $g = \gcd(a, b)$.
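
The proof above is a minimality argument and does not say how to find $u$ and $v$. For experimentation, the coefficients can be computed with the extended Euclidean algorithm, which is a different technique from the one used in the proof. The following Python sketch assumes natural numbers $a$, $b$ with $a + b \ne 0$; the function name `bezout` is chosen here for illustration.

```python
from math import gcd


def bezout(a, b):
    """Return (g, u, v) with g = gcd(a, b) = a*u + b*v.

    Extended Euclidean algorithm; assumes naturals a, b with a + b != 0.
    """
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r      # successive remainders
        old_u, u = u, old_u - q * u      # invariant: old_r == a*old_u + b*old_v
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v


if __name__ == "__main__":
    g, u, v = bezout(12, 18)
    print(g, u, v, 12 * u + 18 * v == gcd(12, 18))
```

The coefficients are integers and may be negative, which is why the lemma asks for $u$ and $v$ in $\mathbb{Z}$ rather than in $\mathbb{N}$.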


I.9.3 Definition. Two natural numbers $a$ and $b$ ($a + b \ne 0$) are relatively prime
iff $\gcd(a, b) = 1$.


The above is the standard definition of relative primality. However, an equi- 
valent definition can save us from a lot of grief when we redo all this formally 
in the next chapter. This alternate definition is furnished by the statement of 
the following lemma, which we prove here (informally) only as a preview for 
things to come in the next chapter. 


I.9.4 Lemma. The natural numbers $a$ and $b$ ($a + b \ne 0$) are relatively prime
iff for all $x > 0$, $a \mid bx$ implies $a \mid x$.


† If $a = b = 0$, then such a largest $d$ does not exist, since $(\forall x)(x > 0 \to x \mid 0)$. Hence the restriction
to $a + b \ne 0$.

Proof. Only-if part. By I.9.2, $1 = au + bv$ for some integers $u$ and $v$. Thus,
$x = axu + bxv$; hence $a \mid x$.

If part. We prove the contrapositive.† Let $1 < d = \gcd(a, b)$. Write $a = dk$
and $b = dm$. Now, $a \mid bk$ (since $bk = dmk = am$), but $a > k$; hence $a \nmid k$.


We can now prove: 


I.9.5 Lemma. With $c > 0$ arbitrary, and $p$ chosen as in (∗∗∗) on p. 156 above,


$$\gcd(1 + (j + 1)p,\ 1 + (i + 1)p) = 1 \quad \text{for all } 0 \le i < j \le c + 1$$


Proof. We will proceed by contradiction. Let $d > 1$ divide both $1 + (j + 1)p$
and $1 + (i + 1)p$. Then $d > c + 1$ (otherwise $d \mid p$ and hence $d \nmid 1 + (j + 1)p$).

Now $d$ divides any linear combination of $1 + (j + 1)p$ and $1 + (i + 1)p$, in
particular $j - i$ (which is equal to $(1 + (i + 1)p)(j + 1) - (1 + (j + 1)p)(i + 1)$).
Since $0 < j - i \le c + 1 < d$, this is impossible.
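
The lemma is easy to spot-check numerically for small $c$. The sketch below is an illustration only; the helper names are hypothetical, and it tests the indices $0, \ldots, c$, which are the ones used for coding later on.

```python
from math import gcd
from functools import reduce


def lcm_upto(m):
    """Least common multiple of 1, 2, ..., m."""
    return reduce(lambda acc, k: acc * k // gcd(acc, k), range(1, m + 1), 1)


def moduli(c):
    """The numbers 1 + (i + 1) * p for i = 0, ..., c, with p = lcm{1, ..., c + 1}."""
    p = lcm_upto(c + 1)
    return [1 + (i + 1) * p for i in range(c + 1)]


def pairwise_coprime(ms):
    """True iff gcd(ms[i], ms[j]) = 1 whenever i < j."""
    return all(gcd(ms[i], ms[j]) == 1
               for j in range(len(ms)) for i in range(j))


if __name__ == "__main__":
    for c in range(1, 8):
        print(c, pairwise_coprime(moduli(c)))
```

The choice of $p$ matters: with $p = 1$ the numbers would be $2, 3, 4, \ldots$, which are clearly not pairwise relatively prime.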


I.9.6 Lemma. If $d_0, \ldots, d_r, b$, with $b$ greater than 1, are such that $\gcd(b, d_i) = 1$
for all $0 \le i \le r$, then $b \nmid \mathrm{lcm}\{d_i : i = 0, \ldots, r\}$.


Proof. Set $z = \mathrm{lcm}\{d_i : i = 0, \ldots, r\}$, and suppose that $b \mid z$; thus $z = bk$ for
some $k$. By I.9.4 (since $d_i \mid z$), we have $d_i \mid k$ for all $0 \le i \le r$. From $b > 1$
follows $k < z$, contradicting the minimality of $z$.


Armed with Lemmata I.9.5 and I.9.6, we can now code any nonempty subset
(the emphasis indicating disrespect for order) of numbers $i_j$, $0 \le i_j \le c$ ($c > 0$
being an arbitrary integer), by the least common multiple $q$ of the numbers
$1 + (i_j + 1)p$, $p$ being the lcm of $\{1, 2, \ldots, c + 1\}$. Indeed, a number of the
form $1 + (z + 1)p$, for $0 \le z \le c$, divides $q$ precisely when $z$ is one of the $i_j$.

Thus, we can code the sequence $a_0, a_1, \ldots, a_n$ by coding the set of numbers
$J(i, a_i)$ (position information $i$ is packed along with value $a_i$), as suggested
immediately above: We let


$$q = \mathrm{lcm}\{1 + (1 + J(i, a_i))p : i = 0, \ldots, n\}$$

With $n$, $c$, $p$, $q$ as above (see (∗∗) and (∗∗∗)), we recover $a_i$ as


$$(\mu y)(1 + (1 + J(i, y))p \mid q) \qquad (1)$$


† The contrapositive of $\mathcal{A} \to \mathcal{B}$ is the tautologically equivalent $\neg\mathcal{B} \to \neg\mathcal{A}$.

or, setting $a = J(c, q)$ and forcing the search to exit gracefully even when $q$
is nonsense (that is, not a code),


$$(\mu y)(y = a \vee 1 + (1 + J(i, y))p \mid L(a)) \qquad (2)$$

Of course, $p$ in (2) above is given in terms of $c$ by (∗∗∗), that is, it equals the
least common multiple of the numbers $\{1, 2, \ldots, K(a) + 1\}$.
In what follows we will use the notation $\beta(a, i)$ for the expression (2) above.
Thus,

$$\lambda ai.\beta(a, i) \in \mathcal{PR}$$
$$\beta(a, i) \le a \quad \text{for all } a, i \qquad (3)$$

The “$\le$” in (3) is “$=$” iff the search in (1) above fails. Otherwise
$y \le J(i, y) < 1 + J(i, y) < L(a) \le a + 1$; hence $y \le J(i, y) < a$.


For any sequence $a_0, \ldots, a_n$ there is an $a$ such that $\beta(a, i) = a_i$ for $i \le n$.
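
The whole construction ($c$, $p$, $q$, the code $a = J(c, q)$, and the decoding search) can be exercised on small sequences. The following Python sketch is an illustration under stated assumptions: it uses $J(x, y) = (x + y)^2 + x$, inverts $J$ in closed form rather than by bounded search, and takes $K(a) = L(a) = a + 1$ when $a$ is not a $J$-value. The names `encode`, `beta`, `unpair`, and `lcm` are chosen here and do not come from the text.

```python
from math import gcd, isqrt
from functools import reduce


def J(x, y):
    """Pairing function J(x, y) = (x + y)^2 + x."""
    return (x + y) ** 2 + x


def unpair(z):
    """Return (K(z), L(z)); z + 1 in both slots when z is not a J-value (assumption)."""
    s = isqrt(z)
    x = z - s * s
    return (x, s - x) if x <= s else (z + 1, z + 1)


def lcm(numbers):
    """Least common multiple of an iterable of positive integers."""
    return reduce(lambda acc, k: acc * k // gcd(acc, k), numbers, 1)


def encode(seq):
    """Return a = J(c, q) for a nonempty sequence a_0, ..., a_n of naturals."""
    c = max(1 + J(i, a_i) for i, a_i in enumerate(seq))
    p = lcm(range(1, c + 2))                      # lcm{1, ..., c + 1}
    q = lcm(1 + (1 + J(i, a_i)) * p for i, a_i in enumerate(seq))
    return J(c, q)


def beta(a, i):
    """Least y with y = a, or with 1 + (1 + J(i, y)) * p dividing L(a).

    Here p = lcm{1, ..., K(a) + 1}. Intended for positions i that the code a
    actually covers; for other i the search may run up to a, which is huge.
    """
    c, q = unpair(a)
    p = lcm(range(1, c + 2))
    y = 0
    while y != a and q % (1 + (1 + J(i, y)) * p) != 0:
        y += 1
    return y


if __name__ == "__main__":
    a = encode([3, 1, 4])
    print([beta(a, i) for i in range(3)])
```

The codes produced this way are very large (Python's arbitrary-precision integers absorb that), and they are not claimed to be the smallest codes of their sequences.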


This is the version of Gödel’s $\beta$-function in Shoenfield (1967). Since we will
carefully prove all the properties of this $\beta$-function, and of its associated
sequence coding scheme, later on, in the context of axiomatic arithmetic
(Section II.2), we will avoid “obvious” details here.


I.9.7 Remark. (1) Why lcm? Both in (∗∗∗) and in the definition of $q$ above, we
could have used product instead. Thus, $p$ could have been given as $c$-factorial
($c!$), and $q$ as the product†


$$\prod\{1 + (1 + J(i, a_i))p : i = 0, \ldots, n\}$$


The seemingly more complex approach via lcm is actually formally simpler.
The lcm is explicitly definable, while the product of a variable number of factors
necessitates a (primitive) recursive definition. Alas, $\beta$ will be employed in the
next chapter precisely in order to show that recursive definitions are allowed in
Peano arithmetic.


(2) Gödel’s original $\beta$ was less “elementary” than the above: Starting with
a modified “$c$”,


$$c' = \max\{1 + a_i : i \le n\} \qquad (**')$$

one then defines

$$p' = n!\,c' \qquad (***')$$


As in Lemma I.9.5, $\gcd(1 + (1 + j)p',\ 1 + (1 + i)p') = 1$ for all $0 \le j < i \le n$.


† As a matter of fact, for a set of relatively prime numbers, their lcm is the same as their product.

By the Chinese remainder theorem (see, e.g., LeVeque (1956)), there is a
$q' \in \mathbb{N}$ such that $q' \equiv a_i \bmod 1 + (1 + i)p'$ for $i = 0, \ldots, n$. Thus, $q'$ codes
the sequence $a_0, \ldots, a_n$. Since each $a_i$ satisfies $a_i < 1 + (1 + i)p'$ by the
choice of $p'$, $a_i$ is recovered from $q'$ by $\mathrm{rem}(q', 1 + (1 + i)p')$. One then sets
$\beta'(q', p', i) = \mathrm{rem}(q', 1 + (1 + i)p')$. This “$\beta'$” is Gödel’s original “$\beta$”.
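
For comparison, the remainder-based coding can be sketched in the same way. The code below is an illustration only. It assumes $p' = n!\,c'$ with $c' = \max\{1 + a_i : i \le n\}$, a choice that makes the moduli $1 + (1 + i)p'$ pairwise relatively prime and larger than every $a_i$, and it finds $q'$ by incremental Chinese remaindering. The names `encode_crt` and `beta_prime` are hypothetical.

```python
from math import factorial


def encode_crt(seq):
    """Return (q1, p1), standing for (q', p'), for a nonempty sequence a_0, ..., a_n.

    Assumption: p' = n! * c' with c' = max{1 + a_i}.
    """
    n = len(seq) - 1
    c1 = max(1 + a_i for a_i in seq)
    p1 = factorial(n) * c1
    q1, modulus = 0, 1
    for i, a_i in enumerate(seq):
        m = 1 + (1 + i) * p1
        # Keep q1 unchanged modulo `modulus` while forcing q1 = a_i modulo m.
        # pow(modulus, -1, m) is the modular inverse (Python 3.8 or later).
        t = ((a_i - q1) * pow(modulus, -1, m)) % m
        q1 += modulus * t
        modulus *= m
    return q1, p1


def beta_prime(q1, p1, i):
    """Remainder of q' on division by 1 + (1 + i) * p'."""
    return q1 % (1 + (1 + i) * p1)


if __name__ == "__main__":
    q1, p1 = encode_crt([3, 1, 4, 1, 5])
    print([beta_prime(q1, p1, i) for i in range(5)])
```

Unlike the lcm-based coding, this version needs two parameters ($q'$ and $p'$) to decode, and its justification rests on the Chinese remainder theorem.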


This ability to code sequences without using (primitive) recursion can be used
to sharpen Corollary I.8.71, so that we can start the arithmetic hierarchy with
even simpler relations $\Sigma_0 = \Pi_0$ than primitive recursive.

We will profit by a brief digression here from our discussion of Gödel
numberings to show that any $R \in \mathcal{PR}_*$ is a $(\exists x)$-projection of a very “simple”
relation, one out of the set defined below. This result will be utilized in our
study of (that is, in the metatheory of) formal arithmetic.


I.9.8 Definition (Constructive Arithmetic Predicates). (Smullyan (1961,
1992), Bennett (1962)). The constructive arithmetic predicates are the smallest
set of predicates that includes $\lambda xyz.z = x + y$, $\lambda xyz.z = x \cdot y$, and is closed
under $\neg$, $\vee$, $(\exists y)_{<z}$, and explicit transformations.


Explicit transformations (Smullyan (1961), Bennett (1962)) are exactly the 
following: Substitution of any constant into a variable, expansion of the vari- 
ables list, permutation of variables, identification of variables. 


We will denote the set of constructive arithmetic predicates by $\mathcal{CA}$.†


Trivially,


I.9.9 Lemma. $\mathcal{CA} \subseteq \mathcal{PR}_*$.


and


I.9.10 Lemma. $\mathcal{CA}$ is closed under $(\exists y)_{\le z}$. Conversely, the definition above
could have been given in terms of $(\exists y)_{\le z}$.


Proof. $(\exists y)_{\le z}R(y, \vec{w}) \leftrightarrow R(z, \vec{w}) \vee (\exists y)_{<z}R(y, \vec{w})$. Conversely,
$(\exists y)_{<z}R(y, \vec{w}) \leftrightarrow (\exists y)_{\le z}(\neg y = z \wedge R(y, \vec{w}))$. Of course, $y = z$ is an explicit
transform of $x + y = z$.


† Smullyan (1992) uses “$\Sigma_0$”, but as we already have been shifting the meaning of that symbol, this
name would not be reliable here. Some use “$\Delta_0$” (due to the presence of bounded quantification),
but this symbol also suffers from the same ambiguities.

I.9.11 Lemma. $\mathcal{CA}$ is closed under substitution of the term $x + 1$ into variables.


Proof. We do induction on the definition of $\mathcal{CA}$.


Basis. 


(1) $z = x + y + 1$: $z = x + y + 1 \leftrightarrow (\exists w)_{\le z}(z = x + w \wedge w = y + 1)$. Of
course, $w = y + 1$ is an explicit transform of $w = y + u$.

(2) $z + 1 = x + y$: $z + 1 = x + y \leftrightarrow (\exists w)_{<x}(x = w + 1 \wedge z = w + y) \vee
(\exists w)_{<y}(y = w + 1 \wedge z = x + w)$.

(3) $z = x(y + 1)$: $z = x(y + 1) \leftrightarrow (x = 0 \wedge z = 0) \vee (\exists w)_{\le z}(z = xw \wedge w = y + 1)$.

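(4) $z + 1 = xy$: $z + 1 = xy \leftrightarrow (\exists u)_{<x}(\exists w)_{<y}(x = u + 1 \wedge y = w + 1 \wedge z =
uw + u + w)$. Of course, $z = uw + u + w \leftrightarrow (\exists x)_{\le z}(\exists y)_{\le z}(z = x + y \wedge x =
uw \wedge y = u + w)$.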
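
Equivalences of this kind are easy to get subtly wrong at the boundary (a bound that is one too small, or a zero factor), so a brute-force check over a small range is a useful sanity test. The Python sketch below is an illustration only. It encodes one reading of the four basis cases, with strict bounds $<x$, $<y$ in cases (2) and (4) and an explicit $x = 0$ disjunct in case (3); these details are assumptions of the sketch, and the helper names are hypothetical.

```python
def exists_leq(bound, pred):
    """Bounded quantifier: some w <= bound satisfies pred."""
    return any(pred(w) for w in range(bound + 1))


def exists_lt(bound, pred):
    """Bounded quantifier: some w < bound satisfies pred."""
    return any(pred(w) for w in range(bound))


def case1(x, y, z):
    """Right-hand side for z = x + y + 1."""
    return exists_leq(z, lambda w: z == x + w and w == y + 1)


def case2(x, y, z):
    """Right-hand side for z + 1 = x + y."""
    return (exists_lt(x, lambda w: x == w + 1 and z == w + y)
            or exists_lt(y, lambda w: y == w + 1 and z == x + w))


def case3(x, y, z):
    """Right-hand side for z = x * (y + 1).

    The x = 0 case is split off: then z = 0, and w = y + 1 cannot be bounded by z.
    """
    return (x == 0 and z == 0) or exists_leq(
        z, lambda w: z == x * w and w == y + 1)


def case4(x, y, z):
    """Right-hand side for z + 1 = x * y."""
    return exists_lt(x, lambda u: exists_lt(
        y, lambda w: x == u + 1 and y == w + 1 and z == u * w + u + w))


def check(limit):
    """Compare each right-hand side with the relation it should express."""
    for x in range(limit):
        for y in range(limit):
            for z in range(limit):
                if case1(x, y, z) != (z == x + y + 1):
                    return False
                if case2(x, y, z) != (z + 1 == x + y):
                    return False
                if case3(x, y, z) != (z == x * (y + 1)):
                    return False
                if case4(x, y, z) != (z + 1 == x * y):
                    return False
    return True
```

A finite check of this kind is of course no proof; it only guards against slips in the bounds.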


The property of $\mathcal{CA}$ that we are proving trivially propagates with the Boolean
operations. Moreover $(\exists y)_{<z+1}R(y, \vec{x}) \leftrightarrow (\exists y)_{\le z}R(y, \vec{x})$.


$(\exists y)_{\le z}$ is a “derived operation” (I.9.10); thus it is not checked in the proof
above. Needless to say, were it the principal (primary) bounded quantification
definitionally, then we would have employed the argument
$(\exists y)_{\le z+1}R(y, \vec{x}) \leftrightarrow R(z + 1, \vec{x}) \vee (\exists y)_{\le z}R(y, \vec{x})$ instead, and the I.H., to
conclude the above proof.


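I.9.12 Corollary. $\mathcal{CA}$ is closed under substitution into variables by functions
$f$ satisfying

(i) $\lambda y\vec{x}_n.y = f(\vec{x}_n)$ is in $\mathcal{CA}$, and

(ii) for some $i$, $f(\vec{x}_n) \le x_i + 1$ for all $\vec{x}_n$.


Proof. Let $R(y, \vec{z})$ be in $\mathcal{CA}$, and $f$ be as described above; in particular let
$f(\vec{x}_n) \le x_i + 1$ for all $\vec{x}_n$. Then


$$R(f(\vec{x}_n), \vec{z}) \leftrightarrow (\exists y)_{\le x_i + 1}(y = f(\vec{x}_n) \wedge R(y, \vec{z}))$$


I.9.13 Lemma. The graphs of $J$, $K$, and $L$ (p. 156) are in $\mathcal{CA}$.
By the “graph” of a function $\lambda\vec{x}.f(\vec{x})$ we understand the relation $\lambda y\vec{x}.y = f(\vec{x})$.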


Proof. The case for $J$ is trivial. $K$ and $L$ present some mild interest. For
example,


$$y = K(x) \leftrightarrow (y = x + 1 \wedge \neg(\exists v)_{\le x}(\exists z)_{\le x}J(v, z) = x) \vee (\exists z)_{\le x}J(y, z) = x$$

I.9.14 Remark. We can substitute $K(x)$ and $L(x)$ for variables in $\mathcal{CA}$ relations,
by I.9.12, since $K(x) \le x + 1$ for all $x$ and $L(x) \le x + 1$ for all $x$. $J(x, y)$
is “too big” for I.9.12 to apply to it, but that too can be substituted, as results
of Bennett show (1962, retold in Tourlakis (1984)). However, we will not need
this fact in this volume.


I.9.15 Proposition. $\lambda aiy.\beta(a, i) = y$ is in $\mathcal{CA}$.

Proof. Recall that $\beta(a, i) = (\mu y)(y = a \vee 1 + (1 + J(i, y))p(a) \mid L(a))$, where

$$p(a) = (\mu z)(z > 0 \wedge (\forall i)_{\le K(a)}\,(i + 1) \mid z)$$

We set for convenience $P(a, i, y) \leftrightarrow 1 + (1 + J(i, y))p(a) \mid L(a)$, and show
that $P$ is in $\mathcal{CA}$. Now $P(a, i, y) \leftrightarrow (\exists u)_{\le a}(L(a) = u(1 + (1 + J(i, y))p(a)))$,
and in view of I.9.12 (cf. I.9.14) and closure under $(\exists y)_{\le w}$, we need only show
that

$$z = u(1 + (1 + J(i, y))p(a))$$

is in $\mathcal{CA}$. The above is equivalent to


$$(\exists v, x)_{\le z}(v = p(a) \wedge z = ux \wedge x = 1 + (1 + J(i, y))v)$$


where we have used the shorthand “$(\exists v, x)_{\le z}$” for “$(\exists v)_{\le z}(\exists x)_{\le z}$”.


That $x = 1 + (1 + J(i, y))v$ is in $\mathcal{CA}$ follows from


$$x = 1 + (1 + J(i, y))v \leftrightarrow (\exists u, w, r, s)_{\le x}(w = J(i, y) \wedge u = w + 1 \wedge r = uv \wedge x = r + 1)$$


We concentrate now on $v = p(a)$. We start with the predicate

$$H(z, w) \leftrightarrow z > 0 \wedge (\forall i)_{\le w}\,(i + 1) \mid z$$

This is in $\mathcal{CA}$ since $(i + 1) \mid z \leftrightarrow (\exists u)_{\le z}\,z = u(i + 1)$. Now

$$v = p(a) \leftrightarrow H(v, K(a)) \wedge (\forall z)_{<v}\neg H(z, K(a))$$


and the case for $v = p(a)$ rests by I.9.14. This concludes the proof that $P(a, i, y)$
is in $\mathcal{CA}$. Finally, setting


$$Q(a, i, y) \leftrightarrow y = a \vee P(a, i, y)$$

we have $Q(a, i, y)$ in $\mathcal{CA}$. We are done by observing that


$$\beta(a, i) = y \leftrightarrow Q(a, i, y) \wedge (\forall z)_{<y}\neg Q(a, i, z)$$


The crucial step in order to achieve the earlier suggested projection
representation of arbitrary primitive recursive relations (p. 160) is to express the arbitrary
primitive recursive “computation”


$$(\exists c_0, \ldots, c_x)\big(z = c_x \wedge c_0 = h(\vec{y}) \wedge (\forall i)_{<x}\,c_{i+1} = g(i, \vec{y}, c_i)\big) \qquad (1)$$

in the form $(\exists u)R$, where $R \in \mathcal{CA}$. (1), of course, “computes” (or verifies) that
$z = f(x, \vec{y})$, where, for all $x$, $\vec{y}$,


$$f(0, \vec{y}) = h(\vec{y})$$
$$f(x + 1, \vec{y}) = g(x, \vec{y}, f(x, \vec{y}))$$
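
The role of the witnesses $c_0, \ldots, c_x$ can be made concrete. The following Python sketch is an illustration only: it uses a single number $y$ in place of the vector $\vec{y}$, stores the witnesses in a Python list rather than in a single coded number, and the example functions `h` and `g` (which define multiplication) are hypothetical choices.

```python
def computation(x, y, h, g):
    """The sequence c_0, ..., c_x with c_0 = h(y) and c_{i+1} = g(i, y, c_i)."""
    cs = [h(y)]
    for i in range(x):
        cs.append(g(i, y, cs[i]))
    return cs


def verifies(cs, x, y, z, h, g):
    """Check that cs witnesses z = f(x, y) in the sense of the computation (1)."""
    return (len(cs) == x + 1
            and z == cs[x]
            and cs[0] == h(y)
            and all(cs[i + 1] == g(i, y, cs[i]) for i in range(x)))


# Hypothetical example: f(x, y) = x * y arises from h(y) = 0 and g(i, y, c) = c + y.
def h(y):
    return 0


def g(i, y, c):
    return c + y


if __name__ == "__main__":
    cs = computation(4, 3, h, g)
    print(cs, verifies(cs, 4, 3, cs[-1], h, g))
```

Verifying a given witness only involves bounded checks, which is the point of the projection representation: the unbounded part is confined to the single existential quantifier that supplies the witness.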


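I.9.16 Theorem. If $R(\vec{x}) \in \mathcal{PR}_*$, then, for some $Q(y, \vec{x}) \in \mathcal{CA}$,

$$R(\vec{x}) \leftrightarrow (\exists y)Q(y, \vec{x})$$


Proof. Since $R(\vec{x}) \in \mathcal{PR}_*$ iff for some $f \in \mathcal{PR}$ one has $R(\vec{x}) \leftrightarrow f(\vec{x}) = 0$,
it suffices to show that for all $f \in \mathcal{PR}$ there is a $Q(z, y, \vec{x}) \in \mathcal{CA}$ such that
$y = f(\vec{x}) \leftrightarrow (\exists z)Q(z, y, \vec{x})$.

We do induction on $\mathcal{PR}$. The graphs of the initial functions are in $\mathcal{CA}$ without
the help of any $(\exists y)$-projection.

Let $f(\vec{x}) = g(h_1(\vec{x}), \ldots, h_m(\vec{x}))$, and assume that (I.H.)
$z = g(\vec{y}_m) \leftrightarrow (\exists u)G(u, z, \vec{y}_m)$ and $y = h_i(\vec{x}) \leftrightarrow (\exists z)H_i(z, y, \vec{x})$ for $i = 1, \ldots, m$,
where $G$ and the $H_i$ are in $\mathcal{CA}$.

Let us write “$(\exists\vec{z}_r)$” and “$(\exists\vec{z}_r)_{\le w}$” as short for “$(\exists z_1)(\exists z_2)\ldots(\exists z_r)$” and
“$(\exists z_1)_{\le w}(\exists z_2)_{\le w}\ldots(\exists z_r)_{\le w}$” respectively. Then


$$y = f(\vec{x}) \leftrightarrow (\exists\vec{u}_m)\big(y = g(\vec{u}_m) \wedge u_1 = h_1(\vec{x}) \wedge \cdots \wedge u_m = h_m(\vec{x})\big)$$
$$\leftrightarrow (\exists\vec{u}_m)\big((\exists z)G(z, y, \vec{u}_m) \wedge (\exists z_1)H_1(z_1, u_1, \vec{x}) \wedge \cdots \wedge (\exists z_m)H_m(z_m, u_m, \vec{x})\big)$$
$$\leftrightarrow (\exists w)(\exists\vec{u}_m)_{\le w}\big((\exists z)_{\le w}G(z, y, \vec{u}_m) \wedge (\exists z_1)_{\le w}H_1(z_1, u_1, \vec{x}) \wedge \cdots \wedge (\exists z_m)_{\le w}H_m(z_m, u_m, \vec{x})\big)$$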


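We finally turn to the graph $z = f(x, \vec{y})$, where $f$ is defined by primitive
recursion from $h$ and $g$ above. This graph is verified by the computation (1).
Assume that (this is the I.H.) $z = h(\vec{y}) \leftrightarrow (\exists w)H(w, z, \vec{y})$ and
$u = g(x, \vec{y}, z) \leftrightarrow (\exists w)G(w, u, x, \vec{y}, z)$, where $H$ and $G$ are in $\mathcal{CA}$. Then

$$z = f(x, \vec{y}) \leftrightarrow (\exists c)\big(\beta(c, 0) = h(\vec{y}) \wedge \beta(c, x) = z \wedge (\forall i)_{<x}\,\beta(c, i + 1) = g(i, \vec{y}, \beta(c, i))\big)$$

$\leftrightarrow$ (using the I.H.)

$$(\exists c)\big((\exists w)H(w, \beta(c, 0), \vec{y}) \wedge \beta(c, x) = z \wedge (\forall i)_{<x}(\exists w)G(w, \beta(c, i + 1), i, \vec{y}, \beta(c, i))\big)$$

$\leftrightarrow$ (see the proof of I.8.48)

$$(\exists u)(\exists c)_{<u}\big((\exists w)_{<u}H(w, \beta(c, 0), \vec{y}) \wedge \beta(c, x) = z \wedge (\forall i)_{<x}(\exists w)_{<u}G(w, \beta(c, i + 1), i, \vec{y}, \beta(c, i))\big)$$


Of course things like $G(w, \beta(c, i + 1), i, \vec{y}, \beta(c, i))$ can be easily seen to be in
$\mathcal{CA}$, since $y = \beta(c, i)$ is by I.9.15, and $\beta(c, i) \le c < u$.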


I.9.17 Corollary. $\Delta$ is the closure of $z = x + y$ and $z = xy$ under $\vee$, $\neg$, $(\exists y)_{<z}$,
$(\exists y)$, and explicit transformations.


Indeed, since $(\exists y)$ subsumes $(\exists y)_{<z}$,

I.9.18 Corollary. $\Delta$ is the closure of $z = x + y$ and $z = xy$ under $\vee$, $\neg$, $(\exists y)$,
and explicit transformations.


The name “arithmetic” relations is now completely justified.
The following corollary is sufficiently important (even useful) to merit
theorem status.


I.9.19 Theorem (A Constructive Arithmetic Kleene Predicate). There is,
for each $n > 0$, a constructive arithmetic predicate $T^{(n)}_{\mathcal{CA}}(i, \vec{x}_n, z)$ such that


(Gn) > @)TCRG, Ens 2) 
Moreover, there is a primitive recursive function U of one argument, such that 
fie U((HaTEXG, Xp, 2) 
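Proof. By I.9.16, let, for every $n > 0$, $C^{(n)} \in \mathcal{CA}$ be such that

$$T^{(n)}(i, \vec{x}_n, z) \leftrightarrow (\exists u)C^{(n)}(u, i, \vec{x}_n, z)$$


Set $T^{(n)}_{\mathcal{CA}}(i, \vec{x}_n, z) \leftrightarrow (\exists u)_{\le z}(\exists w)_{\le z}(z = J(u, w) \wedge C^{(n)}(u, i, \vec{x}_n, w))$, in other
words, $T^{(n)}_{\mathcal{CA}}(i, \vec{x}_n, z) \leftrightarrow C^{(n)}(K(z), i, \vec{x}_n, L(z))$.
For $U$, set, for all $z$, $U(z) = (L(z))_0$.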


1.9. Arithmetic, Definability, Undefinability, and Incompletableness 165 


After our brief digression to obtain I.9.16 and its corollaries, we now return
to codings and Gödel numberings.


We have seen that any (fixed length) sequence $a_0, \ldots, a_n$ can be coded by a
primitive recursive function of the $a_i$, namely, $q$ of p. 158. Indeed, we can code
variable length sequences $b_0, \ldots, b_{n-1}$ by appending the length information,
$n$, at the beginning of the sequence.


I.9.20 Definition. (Following Shoenfield (1967).†) We now revise the symbol
$\langle a_0, \ldots, a_{n-1} \rangle$ to mean, for the balance of this chapter, the smallest $a$ such that


$$\beta(a, 0) = n \quad \text{and} \quad \beta(a, i + 1) = a_i + 1 \text{ for } i < n$$


We re-introduce also $Seq(z)$, $(z)_i$, $*$, and $lh$, but to mean something other than
their homonyms on p. 136 (this time basing the definitions on the $\beta$-coding).
We let $lh(z) = \beta(z, 0)$ and $(z)_i = \beta(z, i + 1) \mathbin{\dot{-}} 1$. Thus $\beta(z, i + 1) > 0$
implies $\beta(z, i + 1) = (z)_i + 1$.
$Seq(z)$ will say that $z$ is some “beta-coded” sequence. We have analogous
expectations on the “structure” of the number $z$ (as in the case of the “$Seq$” of
p. 136) as dictated by I.9.20. Thus a $z$ is a “$Seq$” iff


(1) the sequence terms have been incremented by 1 before “packing” them
into $z$, and
(2) any smaller $x$ ($x < z$, that is) codes a different sequence.‡


That is,


$$Seq(z) \leftrightarrow (\forall i)_{<lh(z)}\,\beta(z, i + 1) > 0 \wedge (\forall x)_{<z}\big(lh(x) \ne lh(z) \vee (\exists i)_{<lh(z)}\,(z)_i \ne (x)_i\big)$$


We also (re)define (see also II.2.16 and II.2.21)


$$a * b = (\mu z)\big(Seq(z) \wedge lh(z) = lh(a) + lh(b) \wedge (\forall i)_{<lh(a)}\,(z)_i = (a)_i \wedge (\forall i)_{<lh(b)}\,(z)_{i + lh(a)} = (b)_i\big)$$


It is clear that $\langle a_0, \ldots, a_{n-1} \rangle * \langle b_0, \ldots, b_{m-1} \rangle = \langle a_0, \ldots, a_{n-1}, b_0, \ldots, b_{m-1} \rangle$.


Note also that $Seq(z)$ implies $(z)_i < z$ for $i < lh(z)$.


† See however II.2.14, p. 242.
‡ $z$ is the smallest number that codes whatever it codes.

As was already suggested on p. 155, if $\ulcorner\ \urcorner : \mathcal{V} \to \mathbb{N}$ is any reasonable 1-1
total mapping from a finite symbol alphabet $\mathcal{V}$ to the natural numbers then we
can use $\langle \ldots \rangle$ to extend it to a Gödel numbering $\ulcorner\ \urcorner : \mathcal{V}^+ \to \mathbb{N}$.†

Rather than attempting to implement that suggestion in the most general
setting, we will shortly show exactly what we have in mind in the case of
languages appropriate for formal arithmetic.

We now fix an alphabet over which we can define a hierarchy of languages
where formal arithmetic can be spoken. For the balance of the chapter, the
absolutely essential nonlogical symbols‡ will be indicated in boldface type:


$$\mathbf{0}, \mathbf{S}, \mathbf{+}, \mathbf{\times}, \mathbf{<} \qquad (NL)$$


The entire alphabet, in the following fixed order, is


$$\mathbf{0},\ \Box,\ \triangle,\ \bigcirc,\ =,\ \neg,\ \vee,\ \exists,\ (,\ ),\ \# \qquad (1)$$


The commas are just metalogical separators in (1) above. All the rest are formal
symbols. The symbol “#” is “glue”, and its purpose is to facilitate building an
unlimited (enumerable) supply of predicate and function symbols, as we will
shortly describe.
The brackets have their normal use, but will be also assisting the symbols
“$\Box$”, “$\triangle$”, and “$\bigcirc$” to build the object variables, function symbols, and
predicate symbols of any language in the hierarchy.


The details are as follows:


Variables. The argot symbol “$v_0$” is short for “$(\Box)$”, “$v_1$” is short for
“$(\Box\Box)$”, and, in general,


“$v_j$” is short for “$(\underbrace{\Box \ldots \Box}_{j+1})$”


Function symbols. The argot symbol “$f^n_j$” (the $j$th function symbol
($j \ge 0$) with arity $n > 0$) is a short name for the string


“$(\underbrace{\triangle \ldots \triangle}_{j+1} \# \underbrace{\triangle \ldots \triangle}_{n})$”


† $\mathcal{V}^+$ denotes the set of all nonempty strings of $\mathcal{V}$.
‡ That is, all languages where arithmetic is spoken and practised formally will include these
symbols.

Predicate symbols. The argot symbol “$P^n_j$” (the $j$th predicate symbol
($j \ge 0$) with arity $n > 0$) is a short name for the string


“$(\underbrace{\bigcirc \ldots \bigcirc}_{j+1} \# \underbrace{\bigcirc \ldots \bigcirc}_{n})$”


Some symbols in the list (NL) are abbreviations (argot) that will be used
throughout, in conformity with standard mathematical practice:


$\mathbf{S}$ names $(\triangle\#\triangle)$, that is, $f^1_0$
$\mathbf{+}$ names $(\triangle\#\triangle\triangle)$, that is, $f^2_0$
$\mathbf{\times}$ names $(\triangle\triangle\#\triangle\triangle)$, that is, $f^2_1$
$\mathbf{<}$ names $(\bigcirc\#\bigcirc\bigcirc)$, that is, $P^2_0$


I.9.21 Definition. A language $L_{\mathfrak{A}}$ of arithmetic is a first order language over
the alphabet (1), with the stated understanding of the ontology of $v_i$’s, $f^n_j$’s,
and $P^n_j$’s.

$L_{\mathfrak{A}}$ must contain the nonlogical symbols (NL).

We will normally write (argot) $t < s$, $t + s$ and $t \times s$ ($t$, $s$ are terms) rather
than $<ts$, $+ts$, and $\times ts$ respectively.

The subscript $\mathfrak{A}$ (for “$\mathfrak{A}$rithmetic”) in $L_{\mathfrak{A}}$ reflects the particular standard
structure (appropriate for the language) “$\mathfrak{A}$” that we have in mind. In what
follows we will study extensively the language for the standard structure†
$\mathfrak{N} = (\mathbb{N}; 0; S, +, \times; <)$.

Our notation for the language under discussion will normally indicate the
structure we have in mind, e.g., by writing $L_{\mathfrak{N}}$.

$L_{\mathfrak{N}}$ has no nonlogical symbols beyond the ones listed under (NL).

Still, occasionally we will just write $L_{\mathfrak{A}}$ regardless, and let the context reveal
what we are thinking.


The “natural” structures for extensions of $L_{\mathfrak{N}}$ will be expansions of $\mathfrak{N}$ (see
Definition I.5.3, p. 54). Such expansions will normally be denoted by
$\mathfrak{N}' = (\mathbb{N}; 0; S, +, \times, \ldots; <, \ldots)$, or just $\mathfrak{N}'$. We will never need to specify the “...”
part in $\mathfrak{N}'$.

The extended language, correspondingly, may be denoted by $L_{\mathfrak{N}'}$, if the
correspondence needs to be emphasized, else we will fall back on the generic
$L_{\mathfrak{A}}$.


† Also sometimes called the “standard model”, although this latter term implies the presence of
some nonlogical axioms for $\mathfrak{N}$ to be a model of.

Note that all languages $L_{\mathfrak{A}}$ will have just one constant symbol, $\mathbf{0}$.

Each bold nonlogical symbol of the language is interpreted as its lightface
“real” counterpart (as usual, $S = \lambda x.x + 1$), that is, $\mathbf{0}^{\mathfrak{N}} = 0$, $\mathbf{<}^{\mathfrak{N}} = {<}$, $\mathbf{S}^{\mathfrak{N}} = S$,
$\mathbf{+}^{\mathfrak{N}} = +$, etc.


Of course, in any expansion $\mathfrak{N}'$, all we do is give meaning to new nonlogical
symbols leaving everything else alone. In particular, in any expansion it is still
the case that $\mathbf{0}^{\mathfrak{N}'} = 0$, $\mathbf{<}^{\mathfrak{N}'} = {<}$, $\mathbf{S}^{\mathfrak{N}'} = S$, $\mathbf{+}^{\mathfrak{N}'} = +$, etc.


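Gödel numbering of $L_{\mathfrak{A}}$: At this point we fix our Gödel numbering for all
possible languages $L_{\mathfrak{A}}$. Referring to (1) of p. 166, we assign


$\ulcorner \mathbf{0} \urcorner = \langle 0, 1 \rangle$
$\ulcorner \Box \urcorner = \langle 0, 2 \rangle$
$\ulcorner \triangle \urcorner = \langle 0, 3 \rangle$
$\ulcorner \bigcirc \urcorner = \langle 0, 4 \rangle$
$\ulcorner = \urcorner = \langle 0, 5 \rangle$
$\ulcorner \neg \urcorner = \langle 0, 6 \rangle$
$\ulcorner \vee \urcorner = \langle 0, 7 \rangle$
$\ulcorner \exists \urcorner = \langle 0, 8 \rangle$
$\ulcorner ( \urcorner = \langle 0, 9 \rangle$
$\ulcorner ) \urcorner = \langle 0, 10 \rangle$
$\ulcorner \# \urcorner = \langle 0, 11 \rangle$

We then set, for all $j$, $n$ in $\mathbb{N}$,

$$\ulcorner (\underbrace{\Box \ldots \Box}_{j+1}) \urcorner = \langle 1, \ulcorner ( \urcorner, \underbrace{\ulcorner \Box \urcorner, \ldots, \ulcorner \Box \urcorner}_{j+1}, \ulcorner ) \urcorner \rangle$$

$$\ulcorner (\underbrace{\triangle \ldots \triangle}_{j+1} \# \underbrace{\triangle \ldots \triangle}_{n+1}) \urcorner = \langle 2, \ulcorner ( \urcorner, \underbrace{\ulcorner \triangle \urcorner, \ldots, \ulcorner \triangle \urcorner}_{j+1}, \ulcorner \# \urcorner, \underbrace{\ulcorner \triangle \urcorner, \ldots, \ulcorner \triangle \urcorner}_{n+1}, \ulcorner ) \urcorner \rangle$$


and

$$\ulcorner (\underbrace{\bigcirc \ldots \bigcirc}_{j+1} \# \underbrace{\bigcirc \ldots \bigcirc}_{n+1}) \urcorner = \langle 3, \ulcorner ( \urcorner, \underbrace{\ulcorner \bigcirc \urcorner, \ldots, \ulcorner \bigcirc \urcorner}_{j+1}, \ulcorner \# \urcorner, \underbrace{\ulcorner \bigcirc \urcorner, \ldots, \ulcorner \bigcirc \urcorner}_{n+1}, \ulcorner ) \urcorner \rangle$$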


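and by recursion on terms and formulas


1. If $g$ and $Q$ are function and predicate symbols respectively of arity $n$, and
$t_1, \ldots, t_n$ are terms, then


$$\ulcorner g t_1 \ldots t_n \urcorner = \langle \ulcorner g \urcorner, \ulcorner t_1 \urcorner, \ldots, \ulcorner t_n \urcorner \rangle$$

and

$$\ulcorner Q t_1 \ldots t_n \urcorner = \langle \ulcorner Q \urcorner, \ulcorner t_1 \urcorner, \ldots, \ulcorner t_n \urcorner \rangle$$


Moreover, for terms $t$ and $s$, $\ulcorner t = s \urcorner = \langle \ulcorner = \urcorner, \ulcorner t \urcorner, \ulcorner s \urcorner \rangle$.
2. If $\mathcal{A}$ and $\mathcal{B}$ are formulas, then


$$\ulcorner (\neg\mathcal{A}) \urcorner = \langle \ulcorner \neg \urcorner, \ulcorner \mathcal{A} \urcorner \rangle$$
$$\ulcorner ((\exists x)\mathcal{A}) \urcorner = \langle \ulcorner \exists \urcorner, \ulcorner x \urcorner, \ulcorner \mathcal{A} \urcorner \rangle$$


and


$$\ulcorner (\mathcal{A} \vee \mathcal{B}) \urcorner = \langle \ulcorner \vee \urcorner, \ulcorner \mathcal{A} \urcorner, \ulcorner \mathcal{B} \urcorner \rangle$$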


Pause. What happened to the brackets in case 2 above? 


It is clear that, intuitively, this numbering is algorithmic both ways. Given a
basic symbol ((1) of p. 166), term or formula, we can apply the above rules to
come up with its Gödel number.

Conversely, given a number $n$ we can test first of all if it satisfies $Seq$ (for all
numbers “$\ulcorner \ldots \urcorner$” necessarily satisfy $Seq$). If so, and if it also satisfies $lh(n) = 2$,
$(n)_0 = 0$ and $1 \le (n)_1 \le 11$, then $n$ is the Gödel number of a basic symbol (1)
of the alphabet, and we can tell which one it is.

If now $(n)_0 = 1$, we further check to see if we got a code for a variable, and,
if so, of which “$v_i$”. This will entail ascertaining that $lh(n) \ge 4$, that $(n)_1$ “is”†
a “(”, $(n)_{lh(n)-1}$ is a “)”, and all the other $(n)_i$ are $\Box$’s. If all this testing passes,
then we have a code for $v_i$ with $i = lh(n) - 4$.

We can similarly check if $n$ codes a function or predicate symbol, and if so,
which one.

As one more example,‡ let $Seq(n)$, $lh(n) = 3$ and $(n)_0 = \ulcorner \exists \urcorner$. Then we
need to ascertain that $(n)_1$ is some $v_i$ and (benefiting from an I.H. that we can
decode numbers $< n$) that $(n)_2$ is some formula $\mathcal{A}$.


Pause. This I.H. hinges on the crucial “if $Seq(n)$, then $(n)_i < n$”.
We have thus decoded $n$ into $((\exists v_i)\mathcal{A})$.
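
The two-way nature of the numbering can be illustrated with a small Python sketch. It is an illustration under stated assumptions: nested Python tuples stand in for the number coding $\langle \ldots \rangle$ (so the "codes" here are not numbers), the three shape symbols of the alphabet are written as □, △, ○, and the function names are hypothetical.

```python
# Basic symbols in their fixed order; the k-th one (k = 1, ..., 11) gets the code <0, k>.
# Assumption: nested tuples stand in for the number coding <...>.
SYMBOLS = ["0", "□", "△", "○", "=", "¬", "∨", "∃", "(", ")", "#"]
CODE = {s: (0, k) for k, s in enumerate(SYMBOLS, start=1)}


def var(j):
    """Code of the variable v_j, i.e. of the string ( □ ... □ ) with j + 1 boxes."""
    return (1, CODE["("]) + (CODE["□"],) * (j + 1) + (CODE[")"],)


def equals(t_code, s_code):
    """Code of the atomic formula t = s from the codes of t and s."""
    return (CODE["="], t_code, s_code)


def exists(var_code, body_code):
    """Code of ((∃x)A) from the codes of x and A."""
    return (CODE["∃"], var_code, body_code)


def decode_var(n):
    """Return j if n is the code of v_j, and None otherwise."""
    if not (isinstance(n, tuple) and len(n) >= 4 and n[0] == 1):
        return None
    if n[1] != CODE["("] or n[-1] != CODE[")"]:
        return None
    if any(item != CODE["□"] for item in n[2:-1]):
        return None
    return len(n) - 4


def decode_exists(n):
    """Return (j, body code) if n has the shape of a code of ((∃v_j)A), else None.

    This sketch does not check that the body really codes a formula.
    """
    if isinstance(n, tuple) and len(n) == 3 and n[0] == CODE["∃"]:
        j = decode_var(n[1])
        if j is not None:
            return j, n[2]
    return None


if __name__ == "__main__":
    n = exists(var(2), equals(var(2), CODE["0"]))
    print(decode_exists(n))
```

With tuples the requirement that components be smaller than the whole is automatic; with genuine number codes it is exactly the property $(n)_i < n$ that makes the recursive decoding terminate.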


We note the following table, which shows that codes of distinct constructs 
do not clash. 


† More accurately, “codes”.
‡ We will formalize this discussion in the next chapter, so there is little point to pursue it in detail
here.

Condition on $n$                                                  Potential expression

$(n)_0 = 0$                                                       Basic symbol
$(n)_0 = 1$                                                       Variable
$(n)_0 = 2$                                                       Function symbol
$(n)_0 = 3$                                                       Predicate symbol
$(n)_0 > 3 \wedge ((n)_0)_0 = 2$                                  Term
$(n)_0 > 3 \wedge ((n)_0)_0 = 3$                                  Atomic formula
$(n)_0 > 3 \wedge ((n)_0)_0 = 0 \wedge ((n)_0)_1 = 5$             Atomic formula (=)
$(n)_0 > 3 \wedge ((n)_0)_0 = 0 \wedge ((n)_0)_1 = 6$             Negation
$(n)_0 > 3 \wedge ((n)_0)_0 = 0 \wedge ((n)_0)_1 = 7$             Disjunction
$(n)_0 > 3 \wedge ((n)_0)_0 = 0 \wedge ((n)_0)_1 = 8$             Existential quantification


Note that $(n)_0 > 3$ for those items claimed above stems from $(z)_i < z$ for
codes $z$, and the presence of, say, “(” (note that $\ulcorner ( \urcorner > 9$) in the syntax of
functions and predicates, while the symbols $=$, $\neg$, $\vee$, $\exists$ have each a code greater
than 5.


Having arithmetized $L_{\mathfrak{A}}$, we can now apply the tools of analysis of number
sets to sets of expressions (strings) over $L_{\mathfrak{A}}$. Thus,


I.9.22 Definition. We call a set $A$ of expressions over $L_{\mathfrak{A}}$ constructive
arithmetic, primitive recursive, recursive, semi-recursive, or definable over $L_{\mathfrak{A}}$ (in
some structure) iff $\{\ulcorner x \urcorner : x \in A\}$ is constructive arithmetic, primitive recursive,
recursive, semi-recursive, or definable over $L_{\mathfrak{A}}$, respectively.


We have seen that one way to generate theories is to start with some set
of nonlogical axioms $\Gamma$ and then build $\mathbf{Thm}_\Gamma$. Another way is to start with a
structure $\mathfrak{M}$ and build $\mathcal{T}(\mathfrak{M}) = \{\mathcal{A} : \models_{\mathfrak{M}} \mathcal{A}\}$.†

We will be interested here, among other theories, in the complete arithmetic,
that is, $\mathcal{T}(\mathfrak{N})$. One of our results will be Tarski’s undefinability of arithmetical
truth, that is, intuitively, that the set $\mathcal{T}(\mathfrak{N})$ is not definable in the structure $\mathfrak{N}$.

Recall that (refer to Definition I.5.15, p. 60) $R \subseteq M^n$ is definable over $L$ in
$\mathfrak{M} = (M, \mathcal{I})$ iff, for some formula $\mathcal{A}(x_1, \ldots, x_n)$ of $L$,


$$R(i_1, \ldots, i_n) \text{ iff } \models_{\mathfrak{M}} \mathcal{A}(\overline{i_1}, \ldots, \overline{i_n}) \quad \text{for all } i_j \in M$$


where $\overline{i_j}$ are imported constants (see I.5.4).


† Cf. I.6.1.

We say that $\mathcal{A}(x_1, \ldots, x_n)$ is a regular formula for $R$ iff the variables
$x_1, \ldots, x_n$ are the formal variables $v_0, \ldots, v_{n-1}$. Applying “dummy renaming”
(I.4.13) judiciously, followed by the substitution metatheorem (I.4.12), we can
convert every formula to a logically equivalent regular formula. For example,
$v_{13} = v_{99}$ is not regular, but it says the same thing as the regular formula
$v_0 = v_1$, i.e., $v_{13} = v_{99} \models v_0 = v_1$ and $v_0 = v_1 \models v_{13} = v_{99}$.

Henceforth we will tacitly assume that formulas representing relations are
always chosen to be regular.


Turning our attention to $\mathfrak{N}'$, we have an interesting variant to the notion of
definability over (any) $L_{\mathfrak{N}'}$, due to the richness of $L_{\mathfrak{A}}$ in certain types of terms.

I.9.23 Definition (Numerals). A term such as “$\underbrace{\mathbf{S} \ldots \mathbf{S}}_{n}\mathbf{0}$” for $n \ge 1$ is called
a numeral. We use the shorthand notation $\widetilde{n}$ for this term. We also let (the case
of $n = 0$) $\widetilde{0}$ be an alias for the formal $\mathbf{0}$.


$\widetilde{n}$ must not be confused with the imported constants $\overline{n}$ of $L_{\mathfrak{A}}(\mathfrak{N})$ (or $L_{\mathfrak{A}}(\mathfrak{N}')$).
The former are (argot names of) terms over $L_{\mathfrak{A}}$; the latter are (argot names of)
new constants, not present in the original $L_{\mathfrak{A}}$ (which only has the constant $\mathbf{0}$).


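The usefulness of numerals stems from the following trivial lemma.

I.9.24 Lemma. For all $n \in \mathbb{N}$, $\tilde{n}^{\mathfrak{N}} = n$.

Proof. Induction on $n$. For $n = 0$, $\tilde{0}^{\mathfrak{N}} = 0^{\mathfrak{N}} = 0$.

Now,

$$\begin{aligned}
\widetilde{n+1}^{\,\mathfrak{N}} &= (\underbrace{SS \ldots S}_{n+1}0)^{\mathfrak{N}} \\
&= S^{\mathfrak{N}}(\tilde{n}^{\mathfrak{N}}) \\
&= S(n) \quad \text{since } S^{\mathfrak{N}} = S, \text{ using the I.H.} \\
&= n + 1
\end{aligned}$$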
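Lemma I.9.24 can be spot-checked mechanically. The Python sketch below is only an illustration under our own assumptions: numerals are modelled as plain strings, and the names `numeral` and `evaluate` are hypothetical helpers, not notation from the text.

```python
def numeral(n: int) -> str:
    """Return the formal term S...S0 with n occurrences of S (assumed string encoding)."""
    if n < 0:
        raise ValueError("numerals exist only for natural numbers")
    return "S" * n + "0"


def evaluate(term: str) -> int:
    """Interpret a numeral in the standard structure: 0 is zero, S is successor."""
    if not term.endswith("0") or set(term[:-1]) - {"S"}:
        raise ValueError(f"not a numeral: {term!r}")
    value = 0
    for _ in term[:-1]:  # one successor step per leading S
        value += 1
    return value
```

A finite check such as `all(evaluate(numeral(n)) == n for n in range(100))` does not replace the induction in the proof; it merely exercises the same two cases (basis and successor step).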


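I.9.25 Corollary. For all $n \in \mathbb{N}$, $\models_{\mathfrak{N}} \tilde{n} = \bar{n}$.

Therefore, for any formula $\mathscr{A}$ over $L_{\mathbf{N}}$ and any $n$, $\models_{\mathfrak{N}} \mathscr{A}[\tilde{n}] \leftrightarrow \mathscr{A}[\bar{n}]$, since
$\models \tilde{n} = \bar{n} \to (\mathscr{A}[\tilde{n}] \leftrightarrow \mathscr{A}[\bar{n}])$.

Therefore, a relation $R \subseteq \mathbb{N}^n$ is definable over $L_{\mathbf{N}}$ in an appropriate expansion
of $\mathfrak{N}$, say $\mathfrak{N}'$, iff there is a regular formula $\mathscr{R}$ in $L_{\mathbf{N}}$ such that, for all $m_j$
in $\mathbb{N}$,

$$R(m_1, \ldots, m_n) \text{ iff } \models_{\mathfrak{N}'} \mathscr{R}(\tilde{m}_1, \ldots, \tilde{m}_n) \tag{D}$$

The above expresses definability over $L_{\mathbf{N}}$ without using any "syntactic materials"
(such as imported constants $\bar{n}$) that are outside the language $L_{\mathbf{N}}$.

Lemmata I.9.24–I.9.25 and (D) above go through, of course, for any expansion
$\mathfrak{N}'$ of $\mathfrak{N}$.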


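The following lemma will be used shortly:

I.9.26 Lemma. The function $\lambda n.\ulcorner \tilde{n} \urcorner$ is primitive recursive.

Proof.

$$\ulcorner \tilde{0} \urcorner = \langle 0, 1 \rangle$$

$$\ulcorner \widetilde{n+1} \urcorner = \langle \ulcorner S \urcorner, \ulcorner \tilde{n} \urcorner \rangle$$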


where, of course, "S77 = (2,77, "A1, "#7, "AT, "))). 


Next, through the concept of definability over the minimum language $L_{\mathbf{N}}$,
we assign (finite) string representations to each arithmetic formula. Through
this device, and Gödel numbering of the string representations, we assign
(indirectly) Gödel numbers to those formulas. Thus, we can speak of recursive,
semi-recursive, arithmetic, etc., subsets of (arithmetic) relations.


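I.9.27 Definition. The set $\boldsymbol{\Delta}_0(L_A)$ of formulas over $L_A$ is the smallest set of
formulas over the language that includes the atomic formulas and satisfies the
closure conditions: If $\mathscr{A}$ and $\mathscr{B}$ are in $\boldsymbol{\Delta}_0(L_A)$, then so are $\neg\mathscr{A}$, $\mathscr{A} \lor \mathscr{B}$, and
$(\exists x)_{<t}\mathscr{A}$, where "$(\exists x)_{<t}\mathscr{A}$" is short for $(\exists x)(x < t \land \mathscr{A})$.

In $(\exists x)_{<t}\mathscr{A}$ we require that the variable $x$ does not occur in the term $t$. In
the case that $L_A = L_{\mathbf{N}}$ we often simply write $\boldsymbol{\Delta}_0$ rather than $\boldsymbol{\Delta}_0(L_{\mathbf{N}})$.

We do hope (by virtue of choosing boldface type for $\boldsymbol{\Delta}_0$) that we will not
confuse this set of formulas over $L_{\mathbf{N}}$ with the $\Delta_0$ relations of the arithmetic
hierarchy.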


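I.9.28 Lemma. Every relation in $\mathcal{CA}$ is definable in $\mathfrak{N}$ over $L_{\mathbf{N}}$ (in the sense
of (D) above) by some formula in $\boldsymbol{\Delta}_0$.

Proof. We do induction on $\mathcal{CA}$. The basis contains two cases, $z = x + y$ and
$z = xy$.

$v_0 = v_1 + v_2$ defines $z = x + y$, since for any $a, b, c$ in $\mathbb{N}$,

$$\begin{aligned}
(\tilde{a} = \tilde{b} + \tilde{c})^{\mathfrak{N}} = \mathbf{t} &\iff \tilde{a}^{\mathfrak{N}} = (\tilde{b} + \tilde{c})^{\mathfrak{N}} \\
&\iff \tilde{a}^{\mathfrak{N}} = \tilde{b}^{\mathfrak{N}} + \tilde{c}^{\mathfrak{N}} \\
&\iff a = b + c
\end{aligned}$$

An analogous case can be made for $z = xy$.

We leave it to the reader to verify that if $R$ and $Q$ are defined by $\mathscr{R}$ and $\mathscr{Q}$
respectively, then $\neg R$ and $R \lor Q$ are defined by $\neg\mathscr{R}$ and $\mathscr{R} \lor \mathscr{Q}$ respectively.

Finally, we show that $(\exists x)_{<y} R(\vec{u}_r, x)$ is defined by $(\exists x)_{<y}\mathscr{R}$, that is,

$$(\exists v_{r+1})(v_{r+1} < v_0 \land \mathscr{R}(v_1, \ldots, v_r, v_{r+1}))$$

Fix $b_0, \ldots, b_r$ in $\mathbb{N}$. Trivially, $x < y$ defines $x < y$; thus (using the I.H. for $R$)
for any fixed $a$ in $\mathbb{N}$,

$$a < b_0 \land R(\vec{b}_r, a) \text{ iff } \models_{\mathfrak{N}} \tilde{a} < \tilde{b}_0 \land \mathscr{R}(\tilde{b}_1, \ldots, \tilde{b}_r, \tilde{a}) \tag{i}$$

Let $(\exists x)_{<b_0} R(\vec{b}_r, x)$. Then the left side of "iff" in (i) holds for some $a$; hence
so does the right side. Therefore,

$$\models_{\mathfrak{N}} \tilde{a} < \tilde{b}_0 \land \mathscr{R}(\tilde{b}_1, \ldots, \tilde{b}_r, \tilde{a}) \tag{ii}$$

Hence

$$\models_{\mathfrak{N}} (\exists v_{r+1})(v_{r+1} < \tilde{b}_0 \land \mathscr{R}(\tilde{b}_1, \ldots, \tilde{b}_r, v_{r+1})) \tag{iii}$$

Conversely, if (iii) holds, then so does (ii) for some $a$, and hence also the right
hand side of "iff" in (i). This yields the left hand side; thus $(\exists x)_{<b_0} R(\vec{b}_r, x)$.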


I.9.29 Lemma. Every relation that is definable in $\mathfrak{N}$ by some formula in $\boldsymbol{\Delta}_0$
over $L_{\mathbf{N}}$ is in $\mathcal{PR}_*$.

Proof. Induction on $\boldsymbol{\Delta}_0$.

Basis. An atomic formula is $t = s$ or $t < s$. It is a trivial matter to verify
that a relation defined by such formulas is obtained from $\lambda xy.x = y$ and $\lambda xy.x < y$
by a finite number of explicit transformations and substitutions of the functions
$\lambda xy.x + y$ and $\lambda xy.xy$ for variables. We know that this does not lead beyond
$\mathcal{PR}_*$.

Induction step. Relations defined by combining their defining formulas
from $\boldsymbol{\Delta}_0$, using (the formal) $\neg$, $\lor$, $(\exists x)_{<y}$ also stay in $\mathcal{PR}_*$ by the closure of
the latter set under (the informal) $\neg$, $\lor$, $(\exists x)_{<y}$.

The above lemma can be sharpened to replace $\mathcal{PR}_*$ by $\mathcal{CA}$. This follows
from results of Bennett (1962) (retold in Tourlakis (1984)), that $\mathcal{CA}$ predicates
are closed under replacement of variables by so-called rudimentary functions
(Smullyan (1961), Bennett (1962), Tourlakis (1984)).
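The induction in Lemma I.9.29 is, in effect, a recipe for deciding bounded formulas in the standard structure: every quantifier ranges over finitely many values. The following Python sketch illustrates this. The tuple encoding of terms and formulas and the names `eval_term`, `holds`, and `EVEN` are our own assumptions for the example, not the Gödel numbering used in the text.

```python
def eval_term(term, env):
    """Value of a term in the standard structure under the assignment env."""
    tag = term[0]
    if tag == "var":
        return env[term[1]]
    if tag == "zero":
        return 0
    if tag == "S":
        return eval_term(term[1], env) + 1
    if tag == "+":
        return eval_term(term[1], env) + eval_term(term[2], env)
    if tag == "*":
        return eval_term(term[1], env) * eval_term(term[2], env)
    raise ValueError(f"unknown term: {term!r}")


def holds(formula, env):
    """Truth value of a bounded formula; mirrors the closure conditions of I.9.27."""
    tag = formula[0]
    if tag == "=":
        return eval_term(formula[1], env) == eval_term(formula[2], env)
    if tag == "<":
        return eval_term(formula[1], env) < eval_term(formula[2], env)
    if tag == "not":
        return not holds(formula[1], env)
    if tag == "or":
        return holds(formula[1], env) or holds(formula[2], env)
    if tag == "exists_lt":
        _, var, bound, body = formula
        limit = eval_term(bound, env)  # the bound term must not mention var
        return any(holds(body, {**env, var: a}) for a in range(limit))
    raise ValueError(f"unknown formula: {formula!r}")


x, y = ("var", "x"), ("var", "y")
# "x is even", written as (exists y) below Sx with x = y + y
EVEN = ("exists_lt", "y", ("S", x), ("=", x, ("+", y, y)))
```

For instance, `holds(EVEN, {"x": 6})` searches only the values 0 through 6 for `y`, which is why the evaluation always terminates.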


I.9.30 Theorem. A relation is arithmetic iff it is definable in $\mathfrak{N}$ over $L_{\mathbf{N}}$.

Proof. If part. We do induction on the complexity of the defining formula.
Atomic formulas define relations in $\mathcal{PR}_*$, as was seen in the basis step above
(indeed, relations in $\mathcal{CA}$ by the previous remarks), and hence in $\Sigma_0$ (Definition
I.8.60). Since arithmetic relations are closed under $\neg$, $\lor$, $(\exists x)$, the property
(that relations which are first order definable in $\mathfrak{N}$ are arithmetic) propagates
with (the formal) $\neg$, $\lor$, $(\exists x)$.

Only-if part. This follows because $z = x + y$ and $z = xy$ are definable
(I.9.28), and definability is preserved by Boolean operations and $(\exists x)$.


We now turn to Tarski’s theorem. 


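I.9.31 Theorem (Tarski). The set $T = \{\ulcorner\mathscr{A}\urcorner : \mathscr{A} \in \mathcal{T}(\mathfrak{N})\}$ is not arithmetic;
therefore it is not definable in $\mathfrak{N}$ (over $L_{\mathbf{N}}$).

Proof. Suppose that $T$ is arithmetic, say,

$$(\lambda x.x \in T) \in \Sigma_m \cup \Pi_m$$

Pick $R(x) \in \Delta_{m+1} - \Sigma_m \cup \Pi_m$ (cf. I.8.72), and let the regular formula $\mathscr{R}(v_0)$
(or, simply, $\mathscr{R}$) define $R$ over $L_{\mathbf{N}}$.

Clearly, the formula

$$(\exists v_0)(v_0 = v_1 \land \mathscr{R}(v_0))$$

also defines $R(x)$, since

$$\models \mathscr{R}(v_1) \leftrightarrow (\exists v_0)(v_0 = v_1 \land \mathscr{R}(v_0))$$

Thus,

$$\begin{aligned}
R(n) &\iff \models_{\mathfrak{N}} (\exists v_0)(v_0 = \tilde{n} \land \mathscr{R}) \\
&\iff \ulcorner (\exists v_0)(v_0 = \tilde{n} \land \mathscr{R}) \urcorner \in T
\end{aligned}$$

By I.9.26, $g = \lambda n.\ulcorner (\exists v_0)(v_0 = \tilde{n} \land \mathscr{R}) \urcorner$ is in $\mathcal{R}$.†
Thus, $R(x)$ iff $g(x) \in T$; hence $R(x) \in \Sigma_m \cup \Pi_m$, which is absurd.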


The trick of considering $(\exists v_0)(v_0 = v_1 \land \mathscr{R}(v_0))$ is due to Tarski. It simplifies
the task of computing the Gödel number of (a formula equivalent to) $\mathscr{R}(\tilde{n})$
from $n$ and the Gödel number of $\mathscr{R}(v_0)$.

Gödel's original approach to computing such numbers was more complicated,
utilizing a substitution function, sub‡ (see next chapter), that was analogous
to Kleene's later invention, the "S-m-n" functions of recursion theory.

We have cheated a bit in the above proof, in pretending that "$\land$" was a basic
symbol of the alphabet. The reader can easily fix this by invoking De Morgan's
laws.

We have just seen that the set of all formulas of arithmetic that are true in
the standard structure $\mathfrak{N}$ is really "hard".§ Certainly it is recursively unsolvable
(not even semi-recursive, not even arithmetic, ...).

What can we say about provable formulas of arithmetic? But then, “provable” 
under what (nonlogical) axioms? 


I.9.32 Definition (Formal Arithmetic(s)). We will use the ambiguous phrase
"a Formal Arithmetic" for any first order theory over a language $L_A$ for
arithmetic (Definition I.9.21), that contains at least the following nonlogical axioms
(due to R. M. Robinson (1950)).

The specific Formal Arithmetic over $L_{\mathbf{N}}$ that has precisely the following
nonlogical axioms we will call ROB. The name "ROB" will also apply to the
list of the axioms below:

ROB-1. (Regarding $S$)
    S1. $\neg Sx = 0$ (for any variable $x$)
    S2. $Sx = Sy \to x = y$ (for any variables $x, y$)
ROB-2. (Regarding $+$)
    +1. $x + 0 = x$ (for any variable $x$)
    +2. $x + Sy = S(x + y)$ (for any variables $x, y$)¶
ROB-3. (Regarding $\times$)
    ×1. $x \times 0 = 0$ (for any variable $x$)
    ×2. $x \times Sy = (x \times y) + x$ (for any variables $x, y$)
ROB-4. (Regarding $<$)
    <1. $\neg x < 0$ (for any variable $x$)
    <2. $x < Sy \leftrightarrow x < y \lor x = y$ (for any variables $x, y$)
    <3. $x < y \lor x = y \lor y < x$ (for any variables $x, y$).

† It is trivial to see that the $\lambda\vec{x}.\langle\vec{x}\rangle$ that was introduced on p. 165 is in $\mathcal{R}$. With slightly more effort
(Exercise I.106) one can see that it is even in $\mathcal{PR}$, and therefore so is $g$.

‡ The function sub is primitive recursive, of two arguments, such that $\ulcorner\mathscr{A}(\tilde{n})\urcorner = \mathrm{sub}(g, n)$, where
$g = \ulcorner\mathscr{A}(v_0)\urcorner$.

§ These are the formulas we often call "really true".

¶ If we had stayed with the formal notation of p. 167, we would not have needed brackets here.
We would have simply written $+xSy = S{+}xy$.


Let us abstract (i.e., generalize) the situation, for a while, and assume that
we have the following:

(1) An arbitrary first order language $L$ over some finite alphabet $\mathcal{V}$ (not
necessarily the one for formal arithmetic).

(2) A Gödel numbering $\ulcorner\ \urcorner$ of $\mathcal{V}^*$ that we extend to terms and formulas, exactly
as we described starting on p. 168, where we extended the specific Gödel
numbering on the alphabet of arithmetic (p. 166). We are continuing to base
our coding on the $\langle \ldots \rangle$ of p. 165.

(3) A theory $\Gamma$ over $L$ with a recursive set of axioms, $\Gamma$, that is, one for which
the set $A = \{\ulcorner\mathscr{F}\urcorner : \mathscr{F} \in \Gamma\}$ is recursive.

(4) The theory $\Gamma$ has just two rules of inference,

$$I_1 : \frac{\mathscr{A}}{\mathscr{B}} \qquad\qquad I_2 : \frac{\mathscr{A}, \mathscr{B}}{\mathscr{C}}$$

(5) There are recursive relations $I_1(x, y)$, $I_2(x, y, z)$, corresponding to $I_1$ and
$I_2$, that mean

$$I_1(\ulcorner\mathscr{X}\urcorner, \ulcorner\mathscr{Y}\urcorner) \text{ iff } \mathscr{X} \vdash \mathscr{Y} \text{ via } I_1$$

and

$$I_2(\ulcorner\mathscr{X}\urcorner, \ulcorner\mathscr{Y}\urcorner, \ulcorner\mathscr{Z}\urcorner) \text{ iff } \mathscr{X}, \mathscr{Y} \vdash \mathscr{Z} \text{ via } I_2$$

That is, intuitively, we require the rules of inference to be "computable" (or
algorithmic).

We call such a theory recursively axiomatized.

A proof over $\Gamma$ is a sequence of formulas $\mathscr{F}_1, \mathscr{F}_2, \ldots, \mathscr{F}_r$, and it is assigned
the Gödel number $\langle\ulcorner\mathscr{F}_1\urcorner, \ulcorner\mathscr{F}_2\urcorner, \ldots, \ulcorner\mathscr{F}_r\urcorner\rangle$.

The following is at the root of Gödel's incompletableness result.

I.9.33 Theorem. If $\Gamma$ is a recursively axiomatized theory over $L$, then $\mathbf{Thm}_\Gamma$
is semi-recursive.


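Of course, this means that the set of Gödel numbers of theorems, $\Theta = \{\ulcorner\mathscr{A}\urcorner :
\mathscr{A} \in \mathbf{Thm}_\Gamma\}$, is semi-recursive.

Proof. We set $A = \{\ulcorner\mathscr{F}\urcorner : \mathscr{F} \in \Gamma\}$ and $B = \{\ulcorner\mathscr{F}\urcorner : \mathscr{F} \in \Lambda\}$, where $\Lambda$ is
our "standard" set of logical axioms. We defer to the reader the proof that $B$ is
recursive.

Actually, we suggest the following easier approach. Show that the set of Gödel
numbers of an equivalent set of logical axioms, $\Lambda_2$ (Exercises I.26–I.41) is
recursive (indeed, primitive recursive).

Let $\mathit{Proof}(u)$ mean that $u$ is the Gödel number of a $\Gamma$-proof. Then

$$\begin{aligned}
\mathit{Proof}(u) \leftrightarrow \mathit{Seq}(u) \land (\forall i)_{<lh(u)} \big[ & (u)_i \in A \lor (u)_i \in B \,\lor \\
& (\exists j)_{<i}\, I_1((u)_j, (u)_i) \,\lor \\
& (\exists j)_{<i}(\exists k)_{<i}\, I_2((u)_j, (u)_k, (u)_i) \big]
\end{aligned}$$

Thus, $\mathit{Proof}(u)$ is in $\mathcal{R}_*$.

Finally, if $\Theta = \{\ulcorner\mathscr{A}\urcorner : \Gamma \vdash \mathscr{A}\}$ (the set of Gödel numbers of $\Gamma$-theorems),
then $\Theta$ is semi-recursive, since

$$x \in \Theta \leftrightarrow (\exists u)\big(\mathit{Proof}(u) \land (\exists i)_{<lh(u)}\, x = (u)_i\big)$$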
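The shape of this argument (a decidable proof predicate followed by an unbounded search over candidate proofs) can be sketched in Python. Everything below is a toy under our own assumptions: "formulas" are natural numbers standing in for Gödel numbers, the axiom and the two rules are invented for the example, and `is_proof` and `semi_decide` are hypothetical names.

```python
from itertools import count, product


def is_proof(seq, is_axiom, rule1, rule2):
    """Mirror Proof(u): each entry is an axiom or follows from earlier entries."""
    for i, formula in enumerate(seq):
        earlier = seq[:i]
        justified = (
            is_axiom(formula)
            or any(rule1(p, formula) for p in earlier)
            or any(rule2(p, q, formula) for p in earlier for q in earlier)
        )
        if not justified:
            return False
    return True


def semi_decide(target, is_axiom, rule1, rule2):
    """Search all finite sequences; halts (returning a proof) iff target is a theorem."""
    for bound in count(1):  # dovetail over entry size and sequence length
        for length in range(1, bound + 1):
            for seq in product(range(bound + 1), repeat=length):
                if target in seq and is_proof(seq, is_axiom, rule1, rule2):
                    return seq


# Toy theory (purely illustrative): one axiom and two computable rules.
def is_axiom(x):
    return x == 3


def rule1(x, y):  # from x infer 2x
    return y == 2 * x


def rule2(x, y, z):  # from x and y infer x * y
    return z == x * y
```

Calling `semi_decide(6, is_axiom, rule1, rule2)` finds the two-line proof of 6 from the axiom 3. On a non-theorem (for example 5 in this toy theory) the search never returns, which is exactly the behaviour "semi-recursive" allows.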


I.9.34 Corollary. The set of theorems of any recursively axiomatized formal
arithmetic is semi-recursive. In particular, the set of theorems of ROB is
semi-recursive.

I.9.35 Definition (Smullyan (1992)). Let us call a formal arithmetic $\Gamma$ over $L_{\mathbf{N}}$
correct iff $\Gamma \subseteq \mathcal{T}(\mathfrak{N})$.

Many people call such an arithmetic "sound". We opted for the nomenclature
"correct", proposed in Smullyan (1992), because we have defined "sound" in
such a manner that all first order theories (recursively axiomatizable or not) are
sound (that is due to the "easy half" of Gödel's completeness theorem).
Correctness along with soundness implies $\mathbf{Thm}_\Gamma \subseteq \mathcal{T}(\mathfrak{N})$.

Thus, we can rephrase the above corollary as

I.9.36 Corollary (Gödel's First Incompleteness Theorem) [Semantic version].
ROB is incompletable as long as we extend it correctly and recursively
over the language $L_{\mathbf{N}}$. That is, every correct recursively axiomatized formal
arithmetic $\Gamma$ over $L_{\mathbf{N}}$ is simply incomplete: There are infinitely many sentences
$\mathscr{F}$ over $L_{\mathbf{N}}$ that are undecidable by $\Gamma$.

Proof. Since $\Gamma$ is correct, $\mathbf{Thm}_\Gamma \subseteq \mathcal{T}(\mathfrak{N})$. This inclusion is proper because $\mathbf{Thm}_\Gamma$
is semi-recursive, while $\mathcal{T}(\mathfrak{N})$ is not even arithmetic.
Every one of the infinitely many sentences $\mathscr{F} \in \mathcal{T}(\mathfrak{N}) - \mathbf{Thm}_\Gamma$,

Pause. Why "infinitely many"?

is unprovable. By correctness, $\neg\mathscr{F}$ is not provable either, since it is false.


I.9.37 Remark. (1) ROB has $\mathfrak{N}$ as a model. Thus it is correct and, of course,
consistent.

(2) There are some "really true" sentences that we cannot prove in ROB (and,
by correctness of ROB, we cannot prove their negations either). For example,
$(\forall x)\neg x < x$ is not provable (see Exercise I.108).

Similarly, $(\forall x)(\forall y)x + y = y + x$ is not provable (see Exercise I.109).
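One standard way to see unprovability claims of this kind is to exhibit a structure that satisfies every ROB axiom yet falsifies the sentence in question. The Python sketch below spot-checks one such candidate for $(\forall x)\neg x < x$: the natural numbers plus a single extra element that is its own successor and lies below itself. The structure, the name `INF`, and the helper functions are our own assumptions, and the cited exercise may intend a different model. The check covers only a finite sample, so it is evidence rather than a proof.

```python
from itertools import product

INF = "inf"  # hypothetical name for the single nonstandard element


def S(x):
    return INF if x == INF else x + 1


def add(x, y):
    return INF if INF in (x, y) else x + y


def mul(x, y):
    if y == 0:
        return 0  # axiom x1 forces x * 0 = 0, even for INF
    if INF in (x, y):
        return INF
    return x * y


def lt(x, y):
    if y == INF:
        return True  # everything, INF included, lies below INF
    if x == INF:
        return False
    return x < y


def rob_axioms_hold(domain):
    """Check the nine ROB axioms on all pairs drawn from a finite sample."""
    for x, y in product(domain, repeat=2):
        ok = (
            S(x) != 0
            and (S(x) != S(y) or x == y)
            and add(x, 0) == x
            and add(x, S(y)) == S(add(x, y))
            and mul(x, 0) == 0
            and mul(x, S(y)) == add(mul(x, y), x)
            and not lt(x, 0)
            and lt(x, S(y)) == (lt(x, y) or x == y)
            and (lt(x, y) or x == y or lt(y, x))
        )
        if not ok:
            return False
    return True


sample = list(range(6)) + [INF]
```

On this sample `rob_axioms_hold(sample)` succeeds while `lt(INF, INF)` is true, so irreflexivity of `<` fails in the structure even though each axiom instance checked holds.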

Thus, it may not have come as a surprise that this formal arithmetic is
incomplete. Gödel's genius came in showing that it is impossible to complete
ROB by throwing axioms at it (a recursive set thereof). It is incompletable.

(3) It turns out that there is an "algorithm" to generate sentences $\mathscr{F} \in \mathcal{T}(\mathfrak{N}) -
\mathbf{Thm}_\Gamma$ for any recursive $\Gamma$ that has been "given", say as a $W_m$ (that is, $W_m =
\{\ulcorner\mathscr{F}\urcorner : \mathscr{F} \in \Gamma\}$).

First of all (cf. proof of I.9.33), we revise $\mathit{Proof}(u)$ to $\lambda um.\mathit{Proof}(u, m)$,
given by

$$\begin{aligned}
\mathit{Proof}(u, m) \leftrightarrow \mathit{Seq}(u) \land (\forall i)_{<lh(u)} \big[ & (\exists z)T(m, (u)_i, z) \lor (u)_i \in B \,\lor \\
& (\exists j)_{<i}\, I_1((u)_j, (u)_i) \,\lor \\
& (\exists j)_{<i}(\exists k)_{<i}\, I_2((u)_j, (u)_k, (u)_i) \big]
\end{aligned}$$

We set $\Theta_m = \{\ulcorner\mathscr{A}\urcorner : \Gamma_{(m)} \vdash \mathscr{A}\}$, where the subscript $(m)$ of $\Gamma$ simply reminds
us how the axioms are "given", namely: $W_m = A = \{\ulcorner\mathscr{F}\urcorner : \mathscr{F} \in \Gamma_{(m)}\}$.
Then $x \in \Theta_m \leftrightarrow (\exists u)(\mathit{Proof}(u, m) \land x = (u)_{lh(u) \mathbin{\dot{-}} 1})$, a semi-recursive
relation of $x$ and $m$, is equivalent to "$\{a\}(x, m)\downarrow$" for some $a$, and therefore to
"$\{S_1^1(a, m)\}(x)\downarrow$". Setting $f(m) = S_1^1(a, m)$ for all $m$, we have

There is a primitive recursive function $f$ such that $\Theta_m = W_{f(m)}$ (∗)

Pause. A by-product worth verbalizing is that if the set of axioms is
semi-recursive, then so is the set of theorems, and from a semi-recursive index of the
former, a semi-recursive index of the latter is primitive recursively computable.

Let now $\mathscr{K}(v_0)$ define $\lambda x.\neg(\exists z)T(x, x, z)$. Clearly, it is also the case that

$$\neg(\exists z)T(x, x, z) \text{ iff } \models_{\mathfrak{N}} (\exists v_0)(v_0 = \tilde{x} \land \mathscr{K}(v_0))$$

We next consider the set $\widetilde{K} = \{\ulcorner(\exists v_0)(v_0 = \tilde{x} \land \mathscr{K}(v_0))\urcorner : x \in \overline{K}\}$.
Set, for all $x$,

$$g(x) = \ulcorner(\exists v_0)(v_0 = \tilde{x} \land \mathscr{K}(v_0))\urcorner$$

Then $g \in \mathcal{PR}$, for the reason we saw in the proof of I.9.31, and

$$x \in \overline{K} \text{ iff } g(x) \in \widetilde{K}$$

We also consider the set $Q = \{\ulcorner(\exists v_0)(v_0 = \tilde{x} \land \mathscr{K}(v_0))\urcorner : x \in \mathbb{N}\}$. Since
$x \in \mathbb{N}$ iff $g(x) \in Q$, and $g$ is strictly increasing, $Q$ is recursive (Exercise I.95).

Actually, all we will need here is that $Q$ is semi-recursive, and this it readily
is, since it is r.e. (Definition I.8.50) via $g$ (strictly increasing or not). Trivially,

$$\widetilde{K} = Q \cap T \tag{∗∗}$$

where, as in I.9.31, we have set $T = \{\ulcorner\mathscr{A}\urcorner : \mathscr{A} \in \mathcal{T}(\mathfrak{N})\}$.


We make a few observations:

(i) If $W_i \subseteq \overline{K}$, then $i \in \overline{K} - W_i$ (else, $i \in K \cap \overline{K}$). Sets such as $\overline{K}$ that come
equipped with an algorithm for producing counterexamples to a claim
$\overline{K} = W_i$ are called productive (Dekker 1955). That is, a set $S$ is productive
with productive function $h \in \mathcal{R}$ iff for all $i$, $W_i \subseteq S$ implies $h(i) \in S - W_i$.
In particular, $h = \lambda x.x$ works in the case of $\overline{K}$.

(ii) We saw that $\overline{K} = g^{-1}[\widetilde{K}]$, where $g^{-1}[\ldots]$ denotes inverse image. We show
that $\widetilde{K}$ is productive as well. We just need to find a productive function for
$\widetilde{K}$. Let $W_i \subseteq \widetilde{K}$. Then $g^{-1}[W_i] \subseteq g^{-1}[\widetilde{K}] = \overline{K}$. By the S-m-n theorem,
there is an $h \in \mathcal{PR}$ such that $g^{-1}[W_i] = W_{h(i)}$. Indeed,

$$\begin{aligned}
x \in g^{-1}[W_i] &\leftrightarrow g(x) \in W_i \\
&\leftrightarrow \{i\}(g(x))\downarrow \\
&\leftrightarrow \{e\}(x, i)\downarrow \quad \text{for some } e \text{ (why?)} \\
&\leftrightarrow \{S_1^1(e, i)\}(x)\downarrow
\end{aligned}$$

Take $h = \lambda x.S_1^1(e, x)$. Thus, $h(i) \in \overline{K} - W_{h(i)} = g^{-1}[\widetilde{K} - W_i]$ ($g$ is 1-1).
Therefore $\lambda x.g(h(x))$ is the sought productive function.

(iii) We finally show that $T$ is productive, using (∗∗) above. First, we ask the
reader (Exercise I.98) to show that there is a $k \in \mathcal{PR}$ such that

$$W_{k(i)} = W_i \cap Q \tag{1}$$

(recall that $Q$ is semi-recursive). We can now find a productive function
for $T$. Let $W_i \subseteq T$. Thus, $W_{k(i)} \subseteq T \cap Q = \widetilde{K}$, from which

$$g(h(k(i))) \in T \cap Q - W_i \cap Q = T \cap Q - W_i \subseteq T - W_i$$

We return now to what this was all about: Suppose we have started with a
correct recursively axiomatized extension of ROB, $\Gamma$ (over $L_{\mathbf{N}}$), where $W_i$ is
the set of Gödel numbers of the nonlogical axioms. By (∗), the set of Gödel
numbers of the $\Gamma$-theorems is $W_{f(i)}$. By correctness, $W_{f(i)} \subseteq T$.

Then $g(h(k(f(i))))$ is the Gödel number of a true unprovable sentence. A
$\phi$-index of $\lambda i.g(h(k(f(i))))$ is an algorithm that produces this sentence for any
set of axioms (coded by) $i$.


We saw in I.9.37(2) above that ROB cannot prove some startlingly simple
(and useful) formulas. On the other hand, it turns out that it has sufficient power
to "define syntactically" all recursive relations. We now turn to study this
phenomenon, which will lead to a syntactic version of Gödel's first incompleteness
theorem.

I.9.38 Definition. Let $\Gamma$ be a formal arithmetic over some language $L_A$ $(\supseteq L_{\mathbf{N}})$.
We say that a relation $R \subseteq \mathbb{N}^r$ is formally definable in $\Gamma$, or $\Gamma$-definable, iff
there is a formula $\mathscr{R}(v_0, \ldots, v_{r-1})$ over $L_A$ such that, for all $\vec{a}_r$ in $\mathbb{N}$,

$$\text{if } R(\vec{a}_r), \text{ then } \Gamma \vdash \mathscr{R}(\tilde{a}_1, \ldots, \tilde{a}_r)$$

and

$$\text{if } \neg R(\vec{a}_r), \text{ then } \Gamma \vdash \neg\mathscr{R}(\tilde{a}_1, \ldots, \tilde{a}_r)$$

Of course, the left "$\neg$" is informal (metamathematical); the right one is formal.

We say that $\mathscr{R}$ formally defines $R$, but often, just that it defines $R$. The
context will determine if we mean formally or in the semantic sense.

The terminology that names the above concept varies quite a bit in the literature.
Instead of "formally definable" some say just "definable" and let the context
fix the meaning. Out of a fear that the context might not always successfully
do so, we added the obvious qualifier "formally", since this type of definability
is about provability, while the other one (I.5.15) is about truth. Terms such
as "representable" (or "$\Gamma$-representable") are also used elsewhere instead of
"formally definable".

It is clear that if $R$ is ROB-definable and $\Gamma$ extends ROB (over the same or
over an extended language), then $R$ is also $\Gamma$-definable.

I.9.39 Lemma. $x = y$ is ROB-definable.

Proof. We show that $v_0 = v_1$ defines $x = y$.

Let $a = b$ (be true). Thus $\tilde{a}$ and $\tilde{b}$ are identical strings over $L_{\mathbf{N}}$. Therefore
$\vdash \tilde{a} = \tilde{b}$ by substitution in the logical axiom $x = x$.

In the present sequence of lemmata regarding the power of ROB we abuse
notation and simply write "$\vdash \ldots$" rather than "ROB $\vdash \ldots$" or "$\vdash_{\mathrm{ROB}} \ldots$".

Let next $a \neq b$. We use (metamathematical) induction on $b$ to show
$\vdash \neg\tilde{a} = \tilde{b}$.

Basis. If $b = 0$, then $a = c + 1$ for some $c$, under the assumption. We want
$\vdash \neg\widetilde{c+1} = \tilde{0}$, i.e., $\vdash \neg S\tilde{c} = 0$, which we have, by axiom S1 (I.9.32) and
substitution.

Induction step. We now go to the case $a \neq b + 1$. If $a = 0$, then we are back
to what we have already argued (using $\vdash x = y \to y = x$). Let then $a = c + 1$.
Thus, $c \neq b$, and hence (I.H.), $\vdash \neg\tilde{c} = \tilde{b}$. By S2 (and $\models_{\mathrm{Taut}}$), $\vdash \neg S\tilde{c} = S\tilde{b}$, i.e.,
$\vdash \neg\widetilde{c+1} = \widetilde{b+1}$.


I.9.40 Corollary. Let $t$ be a term with the free variables $v_0, \ldots, v_{n-1}$ and $f$ a
total function of $n$ arguments such that, for all $\vec{a}_n$, $b$, if $f(\vec{a}_n) = b$, then

$$\vdash t(\tilde{a}_1, \ldots, \tilde{a}_n) = \tilde{b}$$

Then the formula $t(v_0, \ldots, v_{n-1}) = v_n$ defines the graph of $f$.

Proof. It suffices to show that if $f(\vec{a}_n) \neq b$, then

$$\vdash \neg t(\tilde{a}_1, \ldots, \tilde{a}_n) = \tilde{b} \tag{1}$$

Well, the assumption means that for some $c \neq b$, $f(\vec{a}_n) = c$ ($f$ is total), and
hence $\vdash t(\tilde{a}_1, \ldots, \tilde{a}_n) = \tilde{c}$.

If we add $t(\tilde{a}_1, \ldots, \tilde{a}_n) = \tilde{b}$ to our assumptions, then $\vdash \tilde{c} = \tilde{b}$ (by
$\vdash x = y \to y = z \to x = z$ and substitution), which contradicts $\vdash \neg\tilde{c} = \tilde{b}$, yielded
by $c \neq b$. Via proof by contradiction we have established (1).

A function such as $f$ above is term-defined by $t$ (in ROB). Suppressing mention
of $t$, we just say "$f$ is term-definable" (in ROB).


I.9.41 Lemma. $x + y = z$ is ROB-definable.

Proof. The formula $v_0 + v_1 = v_2$ fills the bill. Indeed, by Corollary I.9.40 we
only need to prove that

$$a + b = c \text{ implies } \vdash \tilde{a} + \tilde{b} = \tilde{c} \tag{2}$$

We do induction on $b$.

Basis. Let $b = 0$. Then, $a = c$; thus

$$\vdash \tilde{a} = \tilde{c} \tag{3}$$

By axiom +1 (I.9.32) and substitution, $\vdash \tilde{a} + 0 = \tilde{a}$. Transitivity of formal
equality and (3) yield $\vdash \tilde{a} + 0 = \tilde{c}$.

Induction step. Let $a + (b + 1) = c$. Then $c = d + 1$ for some $d$; hence
$a + b = d$. By I.H.,

$$\vdash \tilde{a} + \tilde{b} = \tilde{d}$$

Hence

$$\vdash S(\tilde{a} + \tilde{b}) = S\tilde{d} \tag{4}$$

by the Leibniz axiom (substitution and modus ponens).
By axiom +2, we also have (via substitution)

$$\vdash \tilde{a} + S\tilde{b} = S(\tilde{a} + \tilde{b})$$

so that transitivity of equality and (4) result in

$$\vdash \tilde{a} + S\tilde{b} = S\tilde{d}$$

that is, $\vdash \tilde{a} + \widetilde{b+1} = \widetilde{d+1}$.
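The induction above is constructive: it tells us how to write down a derivation for any given $a$ and $b$. The following Python sketch lists the steps as (equation, justification) pairs. It is an informal illustration only: numerals are modelled as strings, the justifications are plain-text labels rather than checked inferences, and `numeral` and `derive_sum` are hypothetical names.

```python
def numeral(n: int) -> str:
    """The term S...S0 with n occurrences of S (assumed string encoding)."""
    return "S" * n + "0"


def derive_sum(a: int, b: int):
    """Steps of a derivation of  a~ + b~ = (a+b)~  following the proof of I.9.41."""
    A = numeral(a)
    if b == 0:
        # Basis: a single instance of axiom +1
        return [(f"{A} + 0 = {A}", "axiom +1 and substitution")]
    prev, D = numeral(b - 1), numeral(a + b - 1)
    steps = derive_sum(a, b - 1)  # I.H.: the last line is  A + prev = D
    steps.append((f"S({A} + {prev}) = S{D}", "Leibniz axiom applied to the previous line"))
    steps.append((f"{A} + S{prev} = S({A} + {prev})", "axiom +2 and substitution"))
    steps.append((f"{A} + S{prev} = S{D}", "transitivity of equality"))
    return steps
```

For example, `derive_sum(2, 3)` ends with the line `SS0 + SSS0 = SSSSS0`, and the derivation has $1 + 3b$ lines, one for the basis and three per induction step.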


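I.9.42 Lemma. $x \times y = z$ is ROB-definable.

Proof. Exercise I.110.

I.9.43 Lemma. $x < y$ is ROB-definable.

Proof. By induction on $b$ we prove simultaneously

$$a < b \text{ implies } \vdash \tilde{a} < \tilde{b} \tag{i}$$

and

$$a \not< b \text{ implies } \vdash \neg\tilde{a} < \tilde{b} \tag{ii}$$

Basis. For $b = 0$, (i) is vacuously satisfied, while (ii) follows from axiom <1
and substitution.

Induction step. Let $a < b + 1$. Thus, $a < b$ or $a = b$. One case yields (I.H.)
$\vdash \tilde{a} < \tilde{b}$, and the other $\vdash \tilde{a} = \tilde{b}$ (I.9.39).

By tautological implication, $\vdash \tilde{a} < \tilde{b} \lor \tilde{a} = \tilde{b}$ in either case. Hence
$\vdash \tilde{a} < S\tilde{b}$, by substitution, $\models_{\mathrm{Taut}}$, and axiom <2. That is, $\vdash \tilde{a} < \widetilde{b+1}$.

Let next $a \not< b + 1$. Thus, $a \not< b$ and $a \neq b$. Thus we have both $\vdash \neg\tilde{a} < \tilde{b}$
(by I.H.) and $\vdash \neg\tilde{a} = \tilde{b}$ (by I.9.39); hence (by $\models_{\mathrm{Taut}}$)

$$\vdash \neg(\tilde{a} < \tilde{b} \lor \tilde{a} = \tilde{b}) \tag{iii}$$

Via the equivalence theorem (I.4.25), $\models_{\mathrm{Taut}}$, and axiom <2, (iii) yields
$\vdash \neg\tilde{a} < S\tilde{b}$, that is, $\vdash \neg\tilde{a} < \widetilde{b+1}$.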


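I.9.44 Lemma. For any formula $\mathscr{A}$ and number $n$, ROB can prove

$$\mathscr{A}[\tilde{0}], \ldots, \mathscr{A}[\widetilde{n-1}] \vdash x < \tilde{n} \to \mathscr{A}$$

where $n = 0$ means that we have nothing nonlogical to the left of "$\vdash$" (beyond
the axioms of ROB).

Proof. Induction on $n$.

Basis. If $n = 0$, then we need $\vdash x < 0 \to \mathscr{A}$, which is a tautological
implication of axiom <1.

Induction step. We want

$$\mathscr{A}[\tilde{0}], \ldots, \mathscr{A}[\tilde{n}] \vdash x < S\tilde{n} \to \mathscr{A}$$

By axiom <2 and the equivalence theorem (I.4.25) we need to just prove

$$\mathscr{A}[\tilde{0}], \ldots, \mathscr{A}[\tilde{n}] \vdash (x < \tilde{n} \lor x = \tilde{n}) \to \mathscr{A}$$

This follows at once by proof by cases (I.4.26) since $\mathscr{A}[\tilde{0}], \ldots, \mathscr{A}[\widetilde{n-1}] \vdash
x < \tilde{n} \to \mathscr{A}$ (by I.H.) and $\mathscr{A}[\tilde{n}] \vdash x = \tilde{n} \to \mathscr{A}$ by tautological implication from
the Leibniz axiom and modus ponens.

By the deduction theorem we also get

$$\text{ROB} \vdash \mathscr{A}[\tilde{0}] \to \cdots \to \mathscr{A}[\widetilde{n-1}] \to x < \tilde{n} \to \mathscr{A}$$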
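I.9.45 Lemma. ROB-definable relations are closed under Boolean operations.

Proof. If $R$ and $Q$ are defined by $\mathscr{R}$ and $\mathscr{Q}$ respectively, then it is a trivial
exercise (Exercise I.111) to show that $\neg R$ and $R \lor Q$ are defined by $\neg\mathscr{R}$ and
$\mathscr{R} \lor \mathscr{Q}$ respectively.

I.9.46 Lemma. ROB-definable relations are closed under explicit transformations.

Proof. (Mostly a task for the reader, Exercise I.112.)

Substitution by a constant. Let $R(z, \vec{x}_m)$ be defined by $\mathscr{R}(z, \vec{x}_m)$ (where
we have used syntactic variables $z, \vec{x}_m$ for $v_0, \ldots, v_m$).

Then, for any $a \in \mathbb{N}$, $R(a, \vec{x}_m)$ is defined by $\mathscr{R}(\tilde{a}, \vec{x}_m)$. Indeed, for any
$\vec{b}_m$, if $R(a, \vec{b}_m)$, then $\vdash \mathscr{R}(\tilde{a}, \tilde{b}_1, \ldots, \tilde{b}_m)$, and if $\neg R(a, \vec{b}_m)$, then
$\vdash \neg\mathscr{R}(\tilde{a}, \tilde{b}_1, \ldots, \tilde{b}_m)$.

Introduction of a new variable. Let $Q(\vec{x}_n)$ be defined by $\mathscr{Q}(\vec{x}_n)$ and let
$S(z, \vec{x}_n) \leftrightarrow Q(\vec{x}_n)$ for all $z, \vec{x}_n$, where $z$ is a new variable. That is, $S(z, \vec{x}_n) \leftrightarrow
Q(\vec{x}_n) \land z = z$; hence it is definable (by $\mathscr{Q}(\vec{x}_n) \land z = z$, incidentally) due
to I.9.39 and I.9.45.

I.9.47 Lemma. ROB-definable relations are closed under $(\exists x)_{<y}$.

Proof. Let $Q(x, \vec{z}_n)$ be defined by $\mathscr{Q}(x, \vec{z}_n)$. We show that $(\exists x)_{<y}\mathscr{Q}(x, \vec{z}_n)$,
that is,

$$(\exists x)(x < y \land \mathscr{Q}(x, \vec{z}_n))$$

defines $(\exists x)_{<y} Q(x, \vec{z}_n)$.

Let $(\exists x)_{<a} Q(x, \vec{b}_n)$. Then, for some $c < a$, we have $Q(c, \vec{b}_n)$. By I.9.43,
I.9.45, the assumption, and $\models_{\mathrm{Taut}}$,

$$\vdash \tilde{c} < \tilde{a} \land \mathscr{Q}(\tilde{c}, \tilde{b}_1, \ldots, \tilde{b}_n)$$

Hence (by the substitution axiom and modus ponens)

$$\vdash (\exists x)(x < \tilde{a} \land \mathscr{Q}(x, \tilde{b}_1, \ldots, \tilde{b}_n))$$

Next, let $\neg(\exists x)_{<a} Q(x, \vec{b}_n)$, that is,

$$Q(0, \vec{b}_n), \ldots, Q(a - 1, \vec{b}_n) \text{ are all false}$$

By assumption on $Q$,

$$\vdash \neg\mathscr{Q}(\tilde{i}, \tilde{b}_1, \ldots, \tilde{b}_n), \quad \text{for } i = 0, 1, \ldots, a - 1$$

Hence (I.9.44 followed by generalization)

$$\vdash (\forall x)(x < \tilde{a} \to \neg\mathscr{Q}(x, \tilde{b}_1, \ldots, \tilde{b}_n))$$

In short, removing the abbreviation "$\forall$", and using the logical axiom (tautology)

$$(x < \tilde{a} \to \neg\mathscr{Q}(x, \tilde{b}_1, \ldots, \tilde{b}_n)) \leftrightarrow \neg(x < \tilde{a} \land \mathscr{Q}(x, \tilde{b}_1, \ldots, \tilde{b}_n))$$

and the equivalence theorem, we have

$$\vdash \neg(\exists x)(x < \tilde{a} \land \mathscr{Q}(x, \tilde{b}_1, \ldots, \tilde{b}_n))$$

We have established

I.9.48 Proposition. Every relation in $\mathcal{CA}$ is ROB-definable.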


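I.9.49 Lemma (Separation Lemma). Let $R$ and $Q$ be any two disjoint
semi-recursive sets. Then there is a formula $\mathscr{F}(x)$ such that, for all $m \in \mathbb{N}$,

$$m \in R \text{ implies ROB} \vdash \mathscr{F}(\tilde{m})$$

and

$$m \in Q \text{ implies ROB} \vdash \neg\mathscr{F}(\tilde{m})$$

Proof. By I.9.19, there are relations $A(x, y)$ and $B(x, y)$ in $\mathcal{CA}$ such that

$$m \in R \leftrightarrow (\exists x)A(x, m) \tag{1}$$

and

$$m \in Q \leftrightarrow (\exists x)B(x, m) \tag{2}$$

for all $m$. By I.9.48, let $\mathscr{A}$ and $\mathscr{B}$ define $A$ and $B$ respectively.
We define $\mathscr{F}(y)$ to stand for

$$(\exists x)(\mathscr{A}(x, y) \land (\forall z)(z < x \to \neg\mathscr{B}(z, y))) \tag{3}$$

and proceed to show that it satisfies the conclusion of the lemma.

Let $m \in R$. Then $m \notin Q$; hence (by (1) and (2)), $A(n, m)$ holds for some $n$,
while $B(i, m)$, for all $i \geq 0$, are false. Thus,

$$\vdash \mathscr{A}(\tilde{n}, \tilde{m}) \tag{4}$$

and $\vdash \neg\mathscr{B}(\tilde{i}, \tilde{m})$, for $i \geq 0$.† By I.9.44, $\vdash z < \tilde{n} \to \neg\mathscr{B}(z, \tilde{m})$; hence by
generalization followed by $\models_{\mathrm{Taut}}$ (using (4)),

$$\vdash \mathscr{A}(\tilde{n}, \tilde{m}) \land (\forall z)(z < \tilde{n} \to \neg\mathscr{B}(z, \tilde{m}))$$

The substitution axiom and $\models_{\mathrm{Taut}}$ yield

$$\vdash (\exists x)(\mathscr{A}(x, \tilde{m}) \land (\forall z)(z < x \to \neg\mathscr{B}(z, \tilde{m})))$$

i.e., $\vdash \mathscr{F}(\tilde{m})$.

Let next $m \in Q$. We want to show that $\vdash \neg\mathscr{F}(\tilde{m})$, which (referring to
(3) and using $\models_{\mathrm{Taut}}$, using the Leibniz rule, and inserting/removing the defined
symbol "$\forall$") translates to‡

$$\vdash (\forall x)(\neg\mathscr{A}(x, \tilde{m}) \lor (\exists z)(z < x \land \mathscr{B}(z, \tilde{m}))) \tag{5}$$

The assumption yields $m \notin R$; hence (by (1) and (2)), $B(n, m)$ holds for some
$n$, while $A(i, m)$, for all $i \geq 0$, are false. Thus,

$$\vdash \mathscr{B}(\tilde{n}, \tilde{m}) \tag{6}$$

and $\vdash \neg\mathscr{A}(\tilde{i}, \tilde{m})$ for $i \geq 0$. The latter yields (by I.9.44) $\vdash x < S\tilde{n} \to \neg\mathscr{A}(x, \tilde{m})$,
or, using the Leibniz rule and axiom <2,

$$\vdash (x < \tilde{n} \lor x = \tilde{n}) \to \neg\mathscr{A}(x, \tilde{m}) \tag{7}$$

(6) and $\models_{\mathrm{Taut}}$ yield $\vdash \tilde{n} < x \to \tilde{n} < x \land \mathscr{B}(\tilde{n}, \tilde{m})$; therefore

$$\vdash \tilde{n} < x \to (\exists z)(z < x \land \mathscr{B}(z, \tilde{m})) \tag{8}$$

by the substitution axiom and $\models_{\mathrm{Taut}}$.
Proof by cases (I.4.26) and (7) and (8) yield

$$\vdash (x < \tilde{n} \lor x = \tilde{n} \lor \tilde{n} < x) \to \neg\mathscr{A}(x, \tilde{m}) \lor (\exists z)(z < x \land \mathscr{B}(z, \tilde{m}))$$

Hence (axiom <3)

$$\vdash \neg\mathscr{A}(x, \tilde{m}) \lor (\exists z)(z < x \land \mathscr{B}(z, \tilde{m}))$$

The generalization of the above is (5).

† Throughout, "$\vdash$" is short for "$\vdash_{\mathrm{ROB}}$", of course.
‡ "Translates to" means "is provably equivalent to, without the assistance of any nonlogical
axioms".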
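The separating formula (3) can be read as a race between the two witness searches: it asserts that a witness for $R$ appears with no witness for $Q$ strictly below it. The Python sketch below evaluates that statement in the standard structure for a toy pair of disjoint sets. The predicates `A` and `B` (perfect squares versus positive products of two consecutive numbers) and the name `separator_holds` are our own assumptions chosen for illustration. The function is about truth in the standard structure only, not about provability in ROB.

```python
from itertools import count


def separator_holds(m, A, B):
    """Truth of (exists x)(A(x, m) and (forall z)(z < x -> not B(z, m))).

    The loop terminates only when m has a witness for A or for B,
    that is, when m lies in one of the two sets being separated.
    """
    for x in count():
        if A(x, m):
            return True   # first A-witness, with no B-witness strictly below it
        if B(x, m):
            return False  # a B-witness precedes every possible A-witness


# Toy witness relations (illustrative): R = perfect squares, Q = positive pronic numbers.
def A(x, m):
    return x * x == m


def B(x, m):
    return m > 0 and x * (x + 1) == m
```

With these choices `separator_holds(9, A, B)` is true and `separator_holds(12, A, B)` is false. For an input such as 5, which lies in neither set, the search does not terminate, matching the fact that the lemma promises nothing outside $R \cup Q$.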


The above lemma trivially holds for predicates of any number of arguments.
That is, we may state and prove, with only notational changes ($\vec{y}$ for $y$, $\vec{m}$ for
$m$, $\tilde{m}_1, \ldots, \tilde{m}_n$ for $\tilde{m}$), the same result for disjoint semi-recursive subsets of $\mathbb{N}^n$, for any
$n > 0$.

I.9.50 Corollary. Every recursive predicate is definable in ROB.

Proof. If $R \subseteq \mathbb{N}^n$ is recursive, then it and $\mathbb{N}^n - R$ are semi-recursive.

We can say a bit more. First, a definition:

I.9.51 Definition. A predicate $R(\vec{x}_n)$ is strongly formally definable in $\Gamma$, or
just strongly definable in $\Gamma$, iff, for some formula $\mathscr{R}(\vec{x}_n)$, both of the following
equivalences hold (for all $\vec{b}_n \in \mathbb{N}$):

$$R(\vec{b}_n) \text{ iff } \Gamma \vdash \mathscr{R}(\tilde{b}_1, \ldots, \tilde{b}_n)$$

and

$$\neg R(\vec{b}_n) \text{ iff } \Gamma \vdash \neg\mathscr{R}(\tilde{b}_1, \ldots, \tilde{b}_n)$$

We say that $\mathscr{R}$ strongly defines $R$ (in $\Gamma$).

Correspondingly, a total function $f(\vec{x}_n)$ is strongly term-definable by $t$ just in
case, for all $c, \vec{b}_n \in \mathbb{N}$,

$$f(\vec{b}_n) = c \text{ iff } \Gamma \vdash t(\tilde{b}_1, \ldots, \tilde{b}_n) = \tilde{c}$$

I.9.52 Corollary. Every recursive predicate is strongly definable in ROB.

Proof. Let the recursive $R \subseteq \mathbb{N}^n$ be defined (in ROB) by $\mathscr{R}(\vec{x}_n)$. We already
have the $\to$-directions for the positive and negative cases (by "weak" definability).
We need the $\leftarrow$-directions.

Let then $\vdash \mathscr{R}(\tilde{m}_1, \ldots, \tilde{m}_n)$. If $\neg R(\vec{m}_n)$, then also $\vdash \neg\mathscr{R}(\tilde{m}_1, \ldots, \tilde{m}_n)$,
contradicting the consistency of ROB.† Thus, $R(\vec{m}_n)$. A similar argument works
for the negative case.


We have at once

I.9.53 Corollary. Every recursive predicate is strongly definable in any
consistent extension of ROB.

I.9.54 Definition. Two disjoint semi-recursive subsets $R$ and $Q$ of $\mathbb{N}$ are
recursively inseparable iff there is no recursive set $S$ such that $R \subseteq S$ and $S \subseteq \overline{Q}$
(where $\overline{Q} = \mathbb{N} - Q$).

I.9.55 Lemma. $R = \{x : \{x\}(x) = 0\}$ and $Q = \{x : \{x\}(x) = 1\}$ are recursively
inseparable.

Proof. Clearly $R$ and $Q$ are disjoint semi-recursive sets (why?).
Let $S$ be recursive, and suppose that it separates $R$ and $Q$, i.e.,

$$R \subseteq S \subseteq \overline{Q} \tag{1}$$

Let $\{i\} = \chi_{\overline{S}}$.

Assume that $i \in S$. Then $\{i\}(i) = 1$, hence $i \in Q$ (definition of $Q$). But also
$i \in \overline{Q}$, by (1).

Thus, $i \in \overline{S}$, after all. Well, if so, then $\{i\}(i) = 0$, hence $i \in R$ (definition
of $R$). By (1), $i \in S$ as well.

Thus $i \notin S \cup \overline{S} = \mathbb{N}$, which is absurd.

† Having adopted a Platonist's metatheory, the fact that $\mathfrak{N}$ is a model of ROB establishes
consistency. A constructive proof of the consistency of ROB is beyond our scope. For an outline see
Shoenfield (1967).

I.9.56 Theorem (Church's Theorem) (Church (1936)). For any consistent
extension $\Gamma$ of ROB, the set $\mathbf{Thm}_\Gamma$ is not recursive.


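Proof. Let $\Theta = \{\ulcorner\mathscr{A}\urcorner : \mathscr{A} \in \mathbf{Thm}_\Gamma\}$, where $\Gamma$ consistently extends ROB.
Applying I.9.49 to the $R$ and $Q$ of I.9.55, we obtain a formula $\mathscr{F}$ of a single
free variable $v_0$ such that

$$m \in R \text{ implies } \Gamma \vdash \mathscr{F}(\tilde{m}) \tag{1}$$

and

$$m \in Q \text{ implies } \Gamma \vdash \neg\mathscr{F}(\tilde{m}) \tag{2}$$

Let $S = \{m : \Gamma \vdash \mathscr{F}(\tilde{m})\}$. By (1), $R \subseteq S$. By (2) and consistency, $Q \cap S = \emptyset$,
i.e., $S \subseteq \overline{Q}$.

By I.9.55,

$$S \notin \mathcal{R}_* \tag{3}$$

By definition of $S$,

$$m \in S \text{ iff } \Gamma \vdash \mathscr{F}(\tilde{m}) \text{ iff } \Gamma \vdash (\exists v_0)(v_0 = \tilde{m} \land \mathscr{F}) \tag{4}$$

using the Tarski trick once more (p. 175).

We already know that there is a primitive recursive $g$ such that $g(m) =
\ulcorner(\exists v_0)(v_0 = \tilde{m} \land \mathscr{F})\urcorner$ for all $m$.

Suppose now that $\Theta \in \mathcal{R}_*$. Then so is $S$, since (by (4))

$$m \in S \text{ iff } g(m) \in \Theta \tag{5}$$

This contradicts (3).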


The above theorem, published some three years after Gödel published his
incompleteness results, shows that the decision problem for any consistent
theory (not just for one that is recursively axiomatized) that contains
arithmetic (ROB) is recursively unsolvable. In particular, we rediscover that $\mathcal{T}(\mathfrak{N})$,
which extends ROB consistently, is not recursive. Of course, we already know
much more than this about $\mathcal{T}(\mathfrak{N})$.

Church's result shattered Hilbert's belief that the Entscheidungsproblem
(decision problem) of any axiomatic theory ought to be solvable by "mechanical
means".

On the other hand, Gödel's incompleteness theorems had already shown the
untenability of Hilbert's hope to address the consistency of axiomatic theories
by finitary means.†

† Finitary means can be formalized within Peano arithmetic using, if necessary, arithmetization.
However, the consistency of Peano arithmetic is not provable from within.

The following is a strengthened (by Rosser (1936)) syntactic version of
Gödel's first incompleteness theorem. It makes no reference to correctness;
instead it relies on the concept of consistency.

Gödel's original syntactic version was proved under stronger consistency
assumptions, that the formal arithmetic under consideration was $\omega$-consistent.‡

I.9.57 Theorem (Gödel-Rosser First Incompleteness Theorem) [Syntactic
Version]. Any consistent recursive formal arithmetic $\Gamma$ is simply incomplete.

Proof. We work with the same $R$, $Q$, $S$ as in I.9.55–I.9.56. By I.9.34, $\Theta$ is
semi-recursive, and hence, so is $S$ by (5) in the previous proof.

How about $\overline{S}$?

At this point we add the assumption (that we expect to contradict) that $\Gamma$ is
simply complete. Then (referring to the proof of I.9.56),

$$m \in \overline{S} \text{ iff } \Gamma \nvdash \mathscr{F}(\tilde{m}) \;\overset{\text{completeness}}{\underset{\text{consistency}}{\Longleftrightarrow}}\; \Gamma \vdash \neg\mathscr{F}(\tilde{m})$$

As in the proof of I.9.56, setting, for all $m$, $f(m) = \ulcorner(\exists v_0)(v_0 = \tilde{m} \land \neg\mathscr{F})\urcorner$,
we get

$$m \in \overline{S} \text{ iff } f(m) \in \Theta$$

Since $f$ is in $\mathcal{PR}$, $\overline{S}$ is semi-recursive; hence $S$ is recursive,
contradicting I.9.55.


I.9.58 Remark. (1) The assumption that $\Gamma$ is recursive is not just motivated by
convenience (towards showing that $\Theta$ is r.e.). After all, $\mathcal{T}(\mathfrak{N})$ is a consistent
and simply complete extension of ROB (but it is not recursively axiomatizable,
as we know).

(2) We can retell the above proof slightly differently:

Let $S' = \{m : \Gamma \vdash \neg\mathscr{F}(\tilde{m})\}$. Then $S'$ is semi-recursive. $S \cap S' = \emptyset$ by
consistency. Thus, $S \cup S' \neq \mathbb{N}$ (otherwise $S' = \overline{S}$, making $S$ recursive).

Let $n \in \mathbb{N} - S \cup S'$. Then $\Gamma \nvdash \mathscr{F}(\tilde{n})$ and $\Gamma \nvdash \neg\mathscr{F}(\tilde{n})$.

It is clear that the set $\mathbb{N} - S \cup S'$ is infinite (why?); hence we have infinitely
many undecidable sentences.

(3) For each $n \in \mathbb{N} - S \cup S'$, exactly one of $\mathscr{F}(\tilde{n})$ and $\neg\mathscr{F}(\tilde{n})$ is true (in $\mathfrak{N}$).
Which one?


+ A formal arithmetic is w-inconsistent iff for some formula.F of a single free variable x, (Ax). 
and all of —.7(m) — for all m > 0 — are provable. Otherwise, it is w-consistent. Of course, an 
q-consistent theory, as it fails to prove something, is consistent. For Gédel’s original formulation 
of the first theorem see Exercise I.116. 
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We have defined.¥ (y) ((3) on p. 185 originally; see also the proof of 1.9.56) 
to stand for 


(Ax)(A@,y) A (Wz) <x > 3.Bz,y))) 
where, with R and Q as in I.9.55-1.9.56, 
keR iff Em (Ax). Ax,’ 


andi 


keO iff Em (ax) A(x, k) 


We also note that 


E AF) (V¥x)(. Ax, y) V (Az) <x AAG, y))) (i) 
Now, ifn ¢e N— SUS’, thenn ¢ R; hence Ex, =(x). A(x, 7), that is, 
Em (Vx) A(x, n) (ii) 


Noting that (1.4.24, p. 50) 
E (Vx) F 3 (Vx EV FD) 
for any @ and Y, (ii) yields 
Em 77 (n) (iii) 
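(4) ω-Incompleteness of arithmetic. Now, (iii) implies

$$\models_{\mathfrak{N}} \mathcal{G}(\tilde{k}, \tilde{n}) \quad\text{for all } k \geq 0$$

by (i) and substitution, where $\mathcal{G}(x, y)$ abbreviates

$$\neg\mathcal{A}(x, y) \lor (\exists z)(z < x \land \mathcal{B}(z, y))$$

Thus,

$$\Gamma \vdash \mathcal{G}(\tilde{k}, \tilde{n}) \quad\text{for all } k \geq 0$$

since $\mathcal{A}(x, y)$ and $\mathcal{B}(x, y)$ (p. 185) are in $\Delta_0$.

Thus we have the phenomenon of ω-incompleteness in any consistent (and recursive) extension $\Gamma$ of ROB. That is, there is a formula $\mathcal{H}(x)$ such that while $\Gamma \vdash \mathcal{H}(\tilde{k})$ for all $k \geq 0$, yet $\Gamma \nvdash (\forall x)\mathcal{H}(x)$. An example of such an $\mathcal{H}$ is $\mathcal{G}(x, \tilde{n})$. Therefore, $\mathrm{Thm}_\Gamma$ for such $\Gamma$ is not closed under the "ω-rule"

$$\mathcal{H}(\tilde{0}), \mathcal{H}(\tilde{1}), \mathcal{H}(\tilde{2}), \ldots \vdash (\forall x)\mathcal{H}(x)$$

† This and the previous equivalence are just restatements of (1) and (2) on p. 185.

(5) A consistent but ω-inconsistent arithmetic. Next, consider the theory $\Gamma + \mathcal{F}(\tilde{n})$, for some fixed $n \in \mathbb{N} - S \cup S'$. Since $\Gamma \nvdash \neg\mathcal{F}(\tilde{n})$, this theory is consistent. However, it is ω-inconsistent:

Indeed, by (ii),

$$\models_{\mathfrak{N}} \neg\mathcal{A}(\tilde{k}, \tilde{n}) \quad\text{for all } k \geq 0$$

Hence, for reasons already articulated,

$$\Gamma \vdash \neg\mathcal{A}(\tilde{k}, \tilde{n}) \quad\text{for all } k \geq 0$$

and, trivially,

$$\Gamma + \mathcal{F}(\tilde{n}) \vdash \neg\mathcal{A}(\tilde{k}, \tilde{n}) \quad\text{for all } k \geq 0 \qquad \text{(iv)}$$

But also $\Gamma + \mathcal{F}(\tilde{n}) \vdash \mathcal{F}(\tilde{n})$, so that, along with (iv), we have derived the ω-inconsistency of $\Gamma + \mathcal{F}(\tilde{n})$; for

$$\Gamma + \mathcal{F}(\tilde{n}) \vdash (\exists x)\mathcal{A}(x, \tilde{n})$$

the latter due to $\vdash (\exists x)(\mathcal{C} \land \mathcal{D}) \to (\exists x)\mathcal{C}$ and the definition of $\mathcal{F}$.

(6) A consistent but incorrect arithmetic. The consistent formal arithmetic $\Gamma + \mathcal{F}(\tilde{n})$ is not correct, since $\not\models_{\mathfrak{N}} \mathcal{F}(\tilde{n})$.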

(7) From what we have seen here (cf. also Exercise I.113) we can obtain 
an alternative foundation of computability via ROB: We can define a recursive 
predicate to be one that is strongly definable in ROB (1.9.52). 

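We can also define a semi-recursive predicate $P(\vec{x}_n)$ to be one that is positively strongly definable in ROB, i.e., for some $\mathcal{F}$ and all $\vec{a}_n$,

$$P(\vec{a}_n) \quad\text{iff}\quad \mathrm{ROB} \vdash \mathcal{F}(\tilde{a}_1, \ldots, \tilde{a}_n)$$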


We can then say that a partial recursive function is one whose graph is positively 
strongly definable, while a recursive function is a total partial recursive function. 

With this understanding, uncomputability coincides with undecidability (of 
ROB) and hence with its incomplete(able)ness (unprovability): There are sets 
that are positively strongly definable but not strongly definable (e.g., the set 
K; see also Exercise I.116). Thus, our claim at the beginning of this volume, — 
that not only is there an intimate connection between uncomputability and 
unprovability, but also you cannot have one without having the other — is now 
justified. 


I.10. Exercises

I.1. Prove that the closure obtained by $\mathcal{I} = \{3\}$ and the two relations $z = x + y$ and $z = x - y$ is the set $\{3k : k \in \mathbb{Z}\}$.

I.2. The pair that effects the definition of Term (I.1.5, p. 14) is unambiguous.

I.3. The pair that effects the definition of Wff (I.1.8, p. 15) is unambiguous.

I.4. With reference to I.2.13 (p. 25), prove that if all the $g_Q$ and $h$ are defined everywhere on their input sets (i.e., they are total), these being $\mathcal{I}$ for $h$ and $A \times Y^r$ for $g_Q$ and $(r + 1)$-ary $Q$, then $f$ is defined everywhere on $\mathrm{Cl}(\mathcal{I}, \mathcal{O})$.

I.5. Let us define inductively the set of formulas Ar by:

(1) 1, 2, and 3 are in Ar.

(2) If $a$ and $b$ stand for formulas of Ar, then so do $a + b$ and $a \times b$. (N.B. It is the intention here not to utilize brackets.)

(3) Define a function (intuitively speaking), Eval(x), by Eval(x) = x if x is 1, 2, or 3, and (inductive step) Eval(a + b) = Eval(a) + Eval(b), Eval(a × b) = Eval(a) × Eval(b) for all a, b in Ar, defined via (1) and (2).

Show that the definition (1)–(2) above allows more than one possible parse, which makes (3) ill-defined; indeed, show that for some $a \in$ Ar, Eval(a) has more than one possible value, so it is not a function after all.
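The ambiguity in Exercise I.5 can be observed experimentally. The following Python sketch is an illustration only, not a proof: it tries every way of splitting a bracket-free string of Ar at one of its operators and collects all resulting values. The name `eval_all_parses` and the token-list representation are assumptions made for this sketch; they do not come from the text.

```python
def eval_all_parses(tokens):
    """Return the set of values obtainable from every parse of a bracket-free
    Ar string, given as a list such as ["1", "+", "2", "x", "3"]."""
    if len(tokens) == 1:
        # Basis: the atomic formulas 1, 2, 3 evaluate to themselves.
        return {int(tokens[0])}
    values = set()
    # Operators sit at the odd positions; each one is a candidate "last" step.
    for i in range(1, len(tokens), 2):
        for left in eval_all_parses(tokens[:i]):
            for right in eval_all_parses(tokens[i + 1:]):
                values.add(left + right if tokens[i] == "+" else left * right)
    return values


# The string 1 + 2 x 3 can be read as (1 + 2) x 3 or as 1 + (2 x 3).
print(eval_all_parses(["1", "+", "2", "x", "3"]))  # the set {7, 9}
```

Two distinct values for one string show that clause (3) does not determine a function on Ar.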

I.6. Prove that for every formula $\mathcal{A}$ in Prop (I.3.2, p. 28) the following is true: Every nonempty proper prefix (I.1.4, p. 13) of the string $\mathcal{A}$ has an excess of left brackets.

I.7. Prove that any non-prime $\mathcal{A}$ in Prop has uniquely determined immediate predecessors.

I.8. For any formula $\mathcal{A}$ and any two valuations $v$ and $v'$, $\bar{v}(\mathcal{A}) = \bar{v}'(\mathcal{A})$ if $v$ and $v'$ agree on all the propositional variables that occur in $\mathcal{A}$.

I.9. Prove that $\mathcal{A}[x \leftarrow t]$ is a formula (whenever it is defined) if $t$ is a term.

I.10. Prove that Definition I.3.12 does not depend on our choice of new variables $z_r$.

I.11. Prove that $\vdash (\forall x)(\forall y)\mathcal{A} \leftrightarrow (\forall y)(\forall x)\mathcal{A}$.

I.12. Prove I.4.23.

I.13. Prove I.4.24.

I.14. (1) Show that $x < y \vdash y < x$ ($<$ is some binary predicate symbol; the choice of symbol here is meant to provoke).

(2) Show informally that $\nvdash x < y \to y < x$. (Hint: Use the assumption that our logic "does not lie" (soundness theorem).)

(3) Does this invalidate the deduction theorem? Explain.

I.15. Prove I.4.25.

I.16. Prove that $\vdash x = y \to y = x$.

I.17. Prove that $\vdash x = y \land y = z \to x = z$.

I.18. Suppose that $\Gamma \vdash t_i = s_i$ for $i = 1, \ldots, m$, where the $t_i$, $s_i$ are arbitrary terms. Let $\mathcal{F}$ be a formula, and $\mathcal{F}'$ be obtained from it by replacing any number of occurrences of $t_i$ in $\mathcal{F}$ (not necessarily all) by $s_i$. Prove that $\Gamma \vdash \mathcal{F} \leftrightarrow \mathcal{F}'$.

I.19. Suppose that $\Gamma \vdash t_i = s_i$ for $i = 1, \ldots, m$, where the $t_i$, $s_i$ are arbitrary terms. Let $r$ be a term, and $r'$ be obtained from it by replacing any number of occurrences of $t_i$ in $r$ (not necessarily all) by $s_i$. Prove that $\Gamma \vdash r = r'$.

I.20. Settle the Pause following I.4.21.

I.21. Prove I.4.27.

I.22. Suppose that $x$ is not free in $\mathcal{A}$. Prove that $\vdash \mathcal{A} \to (\forall x)\mathcal{A}$ and $\vdash (\exists x)\mathcal{A} \to \mathcal{A}$.

I.23. Prove the distributive laws: $\vdash (\forall x)(\mathcal{A} \land \mathcal{B}) \leftrightarrow (\forall x)\mathcal{A} \land (\forall x)\mathcal{B}$ and $\vdash (\exists x)(\mathcal{A} \lor \mathcal{B}) \leftrightarrow (\exists x)\mathcal{A} \lor (\exists x)\mathcal{B}$.

I.24. Prove $\vdash (\exists x)(\forall y)\mathcal{A} \to (\forall y)(\exists x)\mathcal{A}$ with two methods: first using the auxiliary constant method, next exploiting monotonicity.

I.25. Prove $\vdash (\exists x)(\mathcal{A} \to (\forall x)\mathcal{A})$.

In what follows let us denote by $\Lambda_1$ the pure logic of Section I.3 (I.3.13 and I.3.15).


Let us now introduce a new pure logic, which we will call $\Lambda_2$. This is exactly the same as our $\Lambda_1$ (which we have called just $\Lambda$ until now), except that we have a different axiom group Ax1. Instead of adopting all tautologies, we only adopt the following four logical axiom schemata of group Ax1 in $\Lambda_2$:†

(1) $\mathcal{A} \lor \mathcal{A} \to \mathcal{A}$

(2) $\mathcal{A} \to \mathcal{A} \lor \mathcal{B}$

(3) $\mathcal{A} \lor \mathcal{B} \to \mathcal{B} \lor \mathcal{A}$

(4) $(\mathcal{A} \to \mathcal{B}) \to (\mathcal{C} \lor \mathcal{A} \to \mathcal{C} \lor \mathcal{B})$

$\Lambda_2$ is due to Hilbert (actually, he also included associativity in the axioms, but, as Gentzen has proved, this was deducible from the system as here given; therefore it was not an independent axiom; see Exercise I.35). In the exercises below we write $\vdash_i$ for $\vdash_{\Lambda_i}$, $i = 1, 2$.

I.26. Show that for all $\mathcal{F}$ and every set of formulas $\Gamma$, if $\Gamma \vdash_2 \mathcal{F}$ holds, then so does $\Gamma \vdash_1 \mathcal{F}$.

Our aim is to see that the logics $\Lambda_1$ and $\Lambda_2$ are equivalent, i.e., have exactly the same theorems. In view of the trivial Exercise I.26 above, what remains to be shown is that every tautology is a theorem of $\Lambda_2$. One particular way to prove this is through the following sequence of $\Lambda_2$-facts.


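† $\neg$ and $\lor$ are the primary symbols; $\to$, $\land$, $\leftrightarrow$ are defined in the usual manner.

I.27. Show the transitivity of $\to$ in $\Lambda_2$: $\mathcal{A} \to \mathcal{B}, \mathcal{B} \to \mathcal{C} \vdash_2 \mathcal{A} \to \mathcal{C}$ for all $\mathcal{A}$, $\mathcal{B}$, and $\mathcal{C}$.

I.28. Show that $\vdash_2 \mathcal{A} \to \mathcal{A}$ (i.e., $\vdash_2 \neg\mathcal{A} \lor \mathcal{A}$), for any $\mathcal{A}$.

I.29. For all $\mathcal{A}$, $\mathcal{B}$ show that $\vdash_2 \mathcal{A} \to \mathcal{B} \lor \mathcal{A}$.

I.30. Show that for all $\mathcal{A}$ and $\mathcal{B}$, $\mathcal{A} \vdash_2 \mathcal{B} \to \mathcal{A}$.

I.31. Show that for all $\mathcal{A}$, $\vdash_2 \neg\neg\mathcal{A} \to \mathcal{A}$ and $\vdash_2 \mathcal{A} \to \neg\neg\mathcal{A}$.

I.32. For all $\mathcal{A}$ and $\mathcal{B}$, show that $\vdash_2 (\mathcal{A} \to \mathcal{B}) \to (\neg\mathcal{B} \to \neg\mathcal{A})$. Conclude that $\mathcal{A} \to \mathcal{B} \vdash_2 \neg\mathcal{B} \to \neg\mathcal{A}$.
(Hint. $\vdash_2 \mathcal{B} \to \neg\neg\mathcal{B}$.)

I.33. Show that $\mathcal{A} \to \mathcal{B} \vdash_2 (\mathcal{B} \to \mathcal{C}) \to (\mathcal{A} \to \mathcal{C})$ for all $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$.

I.34. (Proof by cases in $\Lambda_2$.) Show for all $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$, $\mathcal{D}$,

$$\mathcal{A} \to \mathcal{B},\ \mathcal{C} \to \mathcal{D} \vdash_2 \mathcal{A} \lor \mathcal{C} \to \mathcal{B} \lor \mathcal{D}$$

I.35. Show for all $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$ that

(1) $\vdash_2 \mathcal{A} \lor (\mathcal{B} \lor \mathcal{C}) \to (\mathcal{A} \lor \mathcal{B}) \lor \mathcal{C}$ and

(2) $\vdash_2 (\mathcal{A} \lor \mathcal{B}) \lor \mathcal{C} \to \mathcal{A} \lor (\mathcal{B} \lor \mathcal{C})$.

I.36. Deduction theorem in "propositional" $\Lambda_2$. Prove that if $\Gamma, \mathcal{A} \vdash_2 \mathcal{B}$ using only modus ponens, then also $\Gamma \vdash_2 \mathcal{A} \to \mathcal{B}$ using only modus ponens, for any formulas $\mathcal{A}$, $\mathcal{B}$ and set of formulas $\Gamma$.
(Hint. Induction on the length of proof of $\mathcal{B}$ from $\Gamma \cup \{\mathcal{A}\}$, using the results above.)

I.37. Proof by contradiction in "propositional" $\Lambda_2$. Prove that if $\Gamma, \neg\mathcal{A}$ derives a contradiction in $\Lambda_2$ using only modus ponens,† then $\Gamma \vdash_2 \mathcal{A}$ using only modus ponens, for any formulas $\mathcal{A}$ and set of formulas $\Gamma$. Also prove the converse.

We can now prove the completeness theorem (Post's theorem) for the "propositional segment" of $\Lambda_2$, that is, the logic $\Lambda_3$ (so-called propositional logic, or propositional calculus) obtained from $\Lambda_2$ by keeping only the "propositional axioms" (1)–(4) and modus ponens, dropping the remaining axioms and the $\exists$-introduction rule. (Note. It is trivial that if $\Gamma \vdash_3 \mathcal{A}$, then $\Gamma \vdash_2 \mathcal{A}$.) Namely, we will prove that, for any $\mathcal{A}$ and $\Gamma$, if $\Gamma \models_{\mathrm{taut}} \mathcal{A}$, then $\Gamma \vdash_3 \mathcal{A}$.

First, a definition:

I.10.1 Definition (Complete Sets of Formulas). A set $\Gamma$ is complete iff for every $\mathcal{A}$, at least one of $\mathcal{A}$ or $\neg\mathcal{A}$ is a member of $\Gamma$.

† That is, it proves some $\mathcal{B}$ but also proves $\neg\mathcal{B}$.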


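I.38. Let $\Gamma \nvdash_3 \mathcal{A}$. Prove that there is a complete $\Delta \supseteq \Gamma$ such that also $\Delta \nvdash_3 \mathcal{A}$. This is a completion of $\Gamma$.
(Hint. Let $\mathcal{F}_0, \mathcal{F}_1, \mathcal{F}_2, \ldots$ be an enumeration of all formulas. (There is such an enumeration.) Define $\Delta_n$ by induction on $n$:

$$\Delta_0 = \Gamma$$
$$\Delta_{n+1} = \begin{cases} \Delta_n \cup \{\mathcal{F}_n\} & \text{if } \Delta_n \cup \{\mathcal{F}_n\} \nvdash_3 \mathcal{A} \\ \Delta_n \cup \{\neg\mathcal{F}_n\} & \text{otherwise} \end{cases}$$

To make sense of the above definition, show the impossibility of having both $\Delta_n \cup \{\mathcal{F}_n\} \vdash_3 \mathcal{A}$ and $\Delta_n \cup \{\neg\mathcal{F}_n\} \vdash_3 \mathcal{A}$. Then show that $\Delta = \bigcup_{n \geq 0} \Delta_n$ is as needed.)

I.39. (Post.) If $\Gamma \models_{\mathrm{taut}} \mathcal{A}$, then $\Gamma \vdash_3 \mathcal{A}$.
(Hint. Prove the contrapositive. If $\Gamma \nvdash_3 \mathcal{A}$, let $\Delta$ be a completion (Exercise I.38) of $\Gamma$ such that $\Delta \nvdash_3 \mathcal{A}$. Now, for every prime formula (cf. I.3.1, p. 28) $\mathcal{P}$, exactly one of $\mathcal{P}$ or $\neg\mathcal{P}$ (why exactly one?) is in $\Delta$. Define a valuation (cf. I.3.4, p. 29) $v$ on all prime formulas by

$$v(\mathcal{P}) = \begin{cases} 0 & \text{if } \mathcal{P} \in \Delta \\ 1 & \text{otherwise} \end{cases}$$

Of course, "0" codes, intuitively, "true", while "1" codes "false". To conclude, prove by induction on the formulas of Prop (cf. I.3.2, p. 28) that the extension $\bar{v}$ of $v$ satisfies, for all formulas $\mathcal{B}$, $\bar{v}(\mathcal{B}) = 0$ iff $\mathcal{B} \in \Delta$. Argue that $\mathcal{A} \notin \Delta$.)

I.40. If $\Gamma \models_{\mathrm{taut}} \mathcal{A}$, then $\Gamma \vdash_2 \mathcal{A}$.

I.41. For any formula $\mathcal{F}$ and set of formulas $\Gamma$, one has $\Gamma \vdash_1 \mathcal{F}$ iff $\Gamma \vdash_2 \mathcal{F}$.

I.42. (Compactness of propositional logic.) We say that $\Gamma$ is finitely satisfiable (in the propositional sense) iff every finite subset of $\Gamma$ is satisfiable (cf. I.3.6, p. 30). Prove that $\Gamma$ is satisfiable iff it is finitely satisfiable.
(Hint. Only the if part is nontrivial. It uses Exercise I.39. Further hint: If $\Gamma$ is unsatisfiable, then $\Gamma \models_{\mathrm{taut}} \mathcal{A} \land \neg\mathcal{A}$ for some formula $\mathcal{A}$.)

I.43. Prove semantically, without using soundness, that $\mathcal{A} \models (\forall x)\mathcal{A}$.

I.44. Give a semantic proof of the deduction theorem.

I.45. Show semantically that for all $\mathcal{A}$ and $\mathcal{B}$, $\mathcal{A} \to \mathcal{B} \models (\forall x)\mathcal{A} \to (\forall x)\mathcal{B}$.

I.46. Show semantically that for all $\mathcal{A}$ and $\mathcal{B}$, $\mathcal{A} \to \mathcal{B} \models (\exists x)\mathcal{A} \to (\exists x)\mathcal{B}$.

I.47. Prove the claim in Remark I.6.7.

I.48. Prove that the composition of two embeddings $\phi : \mathfrak{M} \to \mathfrak{K}$ and $\psi : \mathfrak{K} \to \mathfrak{L}$ is an embedding.

I.49. Find two structures that are elementarily equivalent, but not isomorphic.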


I.50. Let $\phi : \mathfrak{M} \to \mathfrak{M}$ be an isomorphism. We say also (since we have the same structure on both sides of "$\to$") that $\phi$ is an automorphism. Prove that if $S \subseteq |\mathfrak{M}|^n$ is first order definable (cf. I.5.15) in $\mathfrak{M}$, then, for all $i_1, \ldots, i_n$ in $|\mathfrak{M}|$,

$$(i_1, \ldots, i_n) \in S \quad\text{iff}\quad (\phi(i_1), \ldots, \phi(i_n)) \in S$$

I.51. Prove that $\mathbb{N}$ is not first order definable in the structure $\mathfrak{R} = (\mathbb{R}, <)$ ($\mathbb{R}$ is the set of reals).
(Hint. Use Exercise I.50 above.)

I.52. Prove that addition is not definable in $(\mathbb{N}, \times)$. More precisely, show that the set $\{(x, y, z) \in \mathbb{N}^3 : z = x + y\}$ is not definable.
(Hint. Define a function $\phi : \mathbb{N} \to \mathbb{N}$ by $\phi(x) = x$ if $x$ is not divisible by 2 or 3. Otherwise $\phi(x) = y$, where $y$ has the same prime number decomposition as $x$, except that the primes 2 and 3 are interchanged. For example, $\phi(6) = 6$, $\phi(9) = 4$, $\phi(12) = 18$. Prove that $\phi$ is an automorphism on $(\mathbb{N}, \times)$, and then invoke Exercise I.50 above.)
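The map in the hint to Exercise I.52 is easy to experiment with. The Python sketch below is a finite sanity check, not a proof: it implements the interchange of the primes 2 and 3 and tests, on a small range, that the map respects multiplication and is its own inverse. The name `swap23` and the convention `swap23(0) = 0` are assumptions of this sketch; the hint does not say how 0 is to be treated.

```python
def swap23(x):
    """Interchange the exponents of the primes 2 and 3 in x.
    Assumption of this sketch: 0 is sent to 0."""
    if x == 0:
        return 0
    twos = threes = 0
    while x % 2 == 0:          # strip all factors of 2
        x //= 2
        twos += 1
    while x % 3 == 0:          # strip all factors of 3
        x //= 3
        threes += 1
    # Reassemble with the two exponents exchanged.
    return x * 2 ** threes * 3 ** twos


# The examples given in the hint.
print(swap23(6), swap23(9), swap23(12))  # 6 4 18

# Finite checks: multiplicative, and an involution (hence a bijection).
assert all(swap23(a * b) == swap23(a) * swap23(b)
           for a in range(60) for b in range(60))
assert all(swap23(swap23(a)) == a for a in range(1000))
```

By contrast, the same map does not respect addition: 1 + 1 = 2, while `swap23(1) + swap23(1)` is 2 and `swap23(2)` is 3.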

I.53. Prove the only-if part of the Łoś–Tarski theorem (I.6.26).

I.54. Prove Theorem I.6.29.

I.55. Prove the if part of Theorem I.6.30.

I.56. Prove by a direct construction, without using the upward Löwenheim–Skolem theorem, that there are nonstandard models for arithmetic.
(Hint. Work with the theory $\mathrm{Th}(\mathfrak{N}) \cup \{\tilde{n} < C : n \in \mathbb{N}\}$ where $C$ is a new constant added to the language of arithmetic $L_N = \{0, S, +, \times, <\}$. Use compactness and the consistency theorem.)

I.57. Prove that if $\Gamma$ has arbitrarily large finite models, then it has an infinite model.

I.58. Prove by a direct construction, without using the upward Löwenheim–Skolem theorem, that there is a nonstandard model for all true first order sentences about the reals.

I.59. Prove Proposition I.6.45.

I.60. Prove the pinching lemma (I.6.53).

I.61. Conclude the proof of I.6.55.

I.62. Let $N$, a unary predicate of the extended language $L$ of the reals, have as its interpretation $N^{\mathfrak{R}} = \mathbb{N}$, the set of natural numbers. Use it to prove that there are infinite natural numbers in ${}^*\mathbb{R}$.
(Hint. Use the true (in $\mathfrak{R}$) sentence $(\forall x)(\exists y)(N(y) \land x < y)$.)

I.63. Prove that if $h$ is an infinite natural number, then so is $h - 1$. A side effect of this is an infinite descending chain of (infinite) natural numbers in ${}^*\mathbb{R}$. Hence there are nonempty sets of natural numbers in ${}^*\mathbb{R}$ with no minimum element. Does this contradict the transfer principle (p. 99)? Why?


I.64. Prove the extreme value theorem: Every real function of one real variable that is continuous on the real interval $[a, b]$ attains its maximum. That is, $(\exists x \in [a, b])(\forall y \in [a, b]) f(y) \leq f(x)$.
(Hint. Subdivide $[a, b]$ into $n > 0$ subintervals of equal length, $[a_i, a_{i+1}]$, where $a_i = a + (b - a)i/n$ for $i = 0, \ldots, n$. Formulate as a first order sentence over $L$ the true (why true?) statement that "for all choices of $n > 0$ (in $\mathbb{N}$) there is an $i \in \mathbb{N}$ such that $f(a + (b - a)i/n)$ is maximum among the values $f(a + (b - a)k/n)$, $k = 0, \ldots, n$". Transfer to ${}^*\mathbb{R}$. This is still true. Now take $n$ to be an infinite natural number $K$. Let $I$ be a (hyperreal) natural number that makes $f(a + (b - a)I/K)$ maximum among the $f(a + (b - a)i/K)$, $0 \leq i \leq K$. See if $f(\mathrm{st}(a + (b - a)I/K))$ is what you want.)

I.65. Use the technique of the previous problem to prove the intermediate value theorem: Every real function of one real variable that is continuous on the real interval $[a, b]$ attains every value between its minimum and its maximum.

I.66. Prove the existence of infinite primes in ${}^*\mathbb{R}$.

I.67. Let $\mathcal{T}$ be a pure theory over some language $L$. Form the theory $\mathcal{T}'$ over $L \cup \{\tau\}$ by adding the schemata

$$(\forall x)(\mathcal{A} \leftrightarrow \mathcal{B}) \to (\tau x)\mathcal{A} = (\tau x)\mathcal{B}$$

and

$$(\exists x)\mathcal{A}[x] \to \mathcal{A}[(\tau x)\mathcal{A}]$$

where $\tau$ is a new symbol used to build terms: If $\mathcal{A}$ is a wff, then $(\tau x)\mathcal{A}$ is a term. Prove that $\mathcal{T}'$ extends $\mathcal{T}$ conservatively ($\tau$ may be interpreted as that of p. 123).

I.68. Prove that in the presence of the initial functions of $\mathcal{PR}$ (I.8.10, p. 128) the following Grzegorczyk (1953) substitution operations can be simulated by composition (I.8.6):

(a) Substitution of a function into a variable

(b) Substitution of a constant into a variable

(c) Identification of any two variables

(d) Permutation of any two variables

(e) Introduction of new ("dummy") variables (i.e., forming $\lambda\vec{x}\vec{y}.g(\vec{y})$).

I.69. Prove that if $f$ is total and $\lambda\vec{x}y.y = f(\vec{x})$ is in $\mathcal{R}_*$, then $f \in \mathcal{R}$. Is the assumption that $f$ is total necessary?

I.70. Show that both $\mathcal{PR}$ and $\mathcal{R}$ are closed under bounded summation, $\sum_{y<z} f(y, \vec{x})$, and bounded product, $\prod_{y<z} f(y, \vec{x})$.

I.71. Prove that if $f$ is total and $\lambda\vec{x}y.y = f(\vec{x})$ is in $\mathcal{P}_*$ (semi-recursive), then $f \in \mathcal{R}$.

I.72. Prove I.8.27 using the if-then-else function rather than addition and multiplication.

I.73. Prove that $p_n \leq 2^{2^n}$ for $n \geq 0$.
(Hint. Course-of-values induction on $n$. Work with $p_0 p_1 \cdots p_n + 1$.)

I.74. Prove, without using the fact that $\lambda n.p_n \in \mathcal{PR}$, that $\pi \in \mathcal{PR}$, where the $\pi$-function is given by $\pi(n) =$ (the number of primes $\leq n$).

I.75. Prove, using Exercise I.74 above, but without using the fact that $\lambda n.p_n \in \mathcal{PR}$, that the predicate $\lambda yn.y = p_n$ is in $\mathcal{PR}_*$. Conclude, using Exercise I.73, that $\lambda n.p_n \in \mathcal{PR}$.

I.76. The Ackermann function† is given by double recursion by

$$A(0, x) = x + 2$$
$$A(n + 1, 0) = 2$$
$$A(n + 1, x + 1) = A(n, A(n + 1, x))$$

Prove that $\lambda nx.A(n, x) \in \mathcal{R}$.
(Hint. Define

$$F(z, n, x) = \begin{cases} x + 2 & \text{if } n = 0 \\ 2 & \text{if } n > 0 \land x = 0 \\ \{z\}(n \dot{-} 1, \{z\}(n, x \dot{-} 1)) & \text{if } n > 0 \land x > 0 \end{cases}$$

Now apply the recursion theorem.)
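For intuition about how fast this version of the Ackermann function grows, here is a minimal Python sketch that evaluates the three defining equations directly. It is a computational illustration only and says nothing about membership in the class of recursive functions as defined in the text. The name `ackermann` and the choice to unwind the recursion on the second argument as a loop (to keep Python's call stack shallow) are assumptions of this sketch.

```python
def ackermann(n, x):
    """Evaluate A(n, x) from the equations
         A(0, x)     = x + 2
         A(n+1, 0)   = 2
         A(n+1, x+1) = A(n, A(n+1, x))."""
    if n == 0:
        return x + 2
    value = 2                      # this is A(n, 0)
    for _ in range(x):             # step from A(n, k) to A(n, k+1)
        value = ackermann(n - 1, value)
    return value


print(ackermann(0, 5))  # 7
print(ackermann(1, 3))  # 8
print(ackermann(2, 3))  # 30
print(ackermann(3, 1))  # 14
```

Only small arguments are practical: already `ackermann(3, 3)` would require evaluating the level-2 function at 65534.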
1.77. Prove that there exists a partial recursive h that satisfies 


ifx=y+l 


h = 
(y,%) a +1,x) otherwise 


Which function is Ax.h(0, x)? 
(Hint. Use the recursion theorem, imitating your solution to Exercise I.76 
above.) 

I.78. Prove that there exists a partial recursive $k$ that satisfies

$$k(y, x) = \begin{cases} 0 & \text{if } x = y + 1 \\ k(y + 1, x) + 1 & \text{otherwise} \end{cases}$$

Which function is $\lambda x.k(0, x)$?

† There are many versions, their origin part of computability folklore. The one here is not the original one.

Given $\lambda y\vec{x}.f(y, \vec{x}) \in \mathcal{P}$. Prove that there exists a partial recursive $g$ that satisfies

$$g(y, \vec{x}) = \begin{cases} y & \text{if } f(y, \vec{x}) = 0 \\ g(y + 1, \vec{x}) & \text{otherwise} \end{cases}$$

How can you express $\lambda\vec{x}.g(0, \vec{x})$ in terms of $f$?

Prove that $K = \{x : \{x\}(x)\downarrow\}$ is not a complete index set, that is, there is no $D \subseteq \mathcal{P}$ such that $K = \{x : \{x\} \in D\}$.
(Hint. Show that there is an $e \in \mathbb{N}$ such that $\{e\}(e) = 0$, while $\{e\}(x)\uparrow$ if $x \neq e$.)

Is $\{x : \{x\}(x) = 0\}$ a complete index set? Recursive? Semi-recursive? (Your answer should not leave any dangling "why"s.)

Let $f \in \mathcal{R}$. Is $\{x : \{f(x)\}(x)\downarrow\}$ a complete index set? Why?

Consider a complete index set $A = \{x : \{x\} \in D\}$ such that there are two functions $\{a\}$ and $\{b\}$, the first in $D$, the second not in $D$, and $\{a\} \subseteq \{b\}$. Prove that there is a 1-1 recursive function $f$ such that

$$x \in \overline{K} \leftrightarrow f(x) \in A$$

Conclude that $A$ is not semi-recursive.
(Hint. To find $f$ use the S-m-n theorem to show that you can have

$$\{f(x)\}(y) = \begin{cases} \{a\}(y) & \text{if computation } \{a\}(y) \text{ takes} \leq \text{steps than } \{x\}(x) \\ \{b\}(y) & \text{otherwise} \end{cases}$$

The wordy condition above can be made rigorous by taking "steps" to be intuitively measured by the size of $z$ in $T(i, x, z)$. "$\leq$" is understood to be fulfilled if both $\{a\}(y)$ and $\{x\}(x)$ are undefined.)

An $A$ as the above is productive.

I.85. Prove that there is an $f \in \mathcal{P}$ such that $W_x \neq \emptyset$ iff $f(x)\downarrow$, and $W_x \neq \emptyset$ iff $f(x) \in W_x$.

I.86. Selection theorem. Prove that for every $n > 0$, there is a function $\lambda a\vec{y}_n.\mathrm{Sel}^{(n)}(a, \vec{y}_n)$ such that

$$(\exists x)(\{a\}(x, \vec{y}_n)\downarrow) \leftrightarrow \mathrm{Sel}^{(n)}(a, \vec{y}_n)\downarrow$$

and

$$(\exists x)(\{a\}(x, \vec{y}_n)\downarrow) \leftrightarrow \{a\}(\mathrm{Sel}^{(n)}(a, \vec{y}_n), \vec{y}_n)\downarrow$$

(Hint. Expand on the proof idea of Exercise I.85.)

I.87. Prove that $f \in \mathcal{P}$ iff its graph $\lambda y\vec{x}.y = f(\vec{x})$ is in $\mathcal{P}_*$.
(Hint. For the if part, one, but not the only, way is to apply the selection theorem.)
I.88. Definition by positive cases. Let $R_i(\vec{x}_n)$, $i = 1, \ldots, k$, be mutually exclusive relations in $\mathcal{P}_*$, and $\lambda\vec{x}_n.f_i(\vec{x}_n)$, $i = 1, \ldots, k$, functions in $\mathcal{P}$. Then $f$ defined below is in $\mathcal{P}$:

$$f(\vec{x}_n) = \begin{cases} f_1(\vec{x}_n) & \text{if } R_1(\vec{x}_n) \\ \quad\vdots & \quad\vdots \\ f_k(\vec{x}_n) & \text{if } R_k(\vec{x}_n) \\ \uparrow & \text{otherwise} \end{cases}$$

(Hint. The if-then-else function will not work here. (Why? I thought $\mathcal{P}$ was closed under if-then-else.) Either use the selection theorem directly, or use Exercise I.87.)

I.89. Prove that every r.e. relation $R(\vec{x})$ can be enumerated by a partial recursive function.
(Hint. Modify the "otherwise" part in the proof of I.8.51.)

I.90. Prove that there is an $h \in \mathcal{PR}$ such that $W_x = \mathrm{ran}(\{h(x)\})$.
(Hint. Use Exercise I.89, but include $x$ in the active arguments. Then use the S-m-n theorem.)

I.91. Sharpen Exercise I.90 above as follows: Ensure that $h$ is such that $W_x \neq \emptyset$ implies $\{h(x)\} \in \mathcal{R}$.
(Hint. Use Exercise I.85 for the "otherwise".)

I.92. Prove that for some $\sigma \in \mathcal{PR}$, $\mathrm{ran}(\{x\}) = \mathrm{dom}(\{\sigma(x)\})$.
(Hint. Show that $y \in \mathrm{ran}(\{x\})$ is semi-recursive, and then use S-m-n.)

I.93. Prove that there is a $\tau \in \mathcal{PR}$ such that $\mathrm{ran}(\{x\}) = \mathrm{ran}(\{\tau(x)\})$ and, moreover, $\mathrm{ran}(\{x\}) \neq \emptyset$ implies that $\{\tau(x)\} \in \mathcal{R}$.

I.94. Prove that a set is recursive and nonempty iff it is the range of a non-decreasing recursive function.
(Hint. To check for $a \in \mathrm{ran}(f)$, $f$ non-decreasing, find the smallest $i$ such that $a \leq f(i)$, etc.)

I.95. Prove that a set is recursive and infinite iff it is the range of a strictly increasing recursive function.

I.96. Prove that every infinite semi-recursive set has an infinite recursive subset.
(Hint. Effectively define a strictly increasing subsequence.)

I.97. Prove that there is an $m \in \mathcal{PR}$ such that $W_x$ infinite implies that $W_{m(x)} \subseteq W_x$ and $W_{m(x)}$ is an infinite recursive set.
(Hint. Use Exercises I.91 and I.96.)

I.98. Prove that there is an $h \in \mathcal{PR}$ such that $W_x \cap W_y = W_{h(x,y)}$ (all $x$, $y$).

I.99. Prove that there is a $k \in \mathcal{PR}$ such that $W_x \cup W_y = W_{k(x,y)}$ (all $x$, $y$).

I.100. Prove that there is a $g \in \mathcal{PR}$ such that $W_x \times W_y = W_{g(x,y)}$ (all $x$, $y$).


I.101. Prove that $\mathcal{P}$ is not closed under min by finding a function $f \in \mathcal{P}$ such that

$$\lambda x.\begin{cases} \min\{y : f(x, y) = 0\} & \text{if min exists} \\ \uparrow & \text{otherwise} \end{cases}$$

is total and 0-1-valued, but not recursive.
(Hint. Try $f(x, y)$ that yields 0 if $(y = 0 \land \{x\}(x)\downarrow) \lor y = 1$ and is undefined for all other inputs.)

I.102. Prove that $\lambda x.\{x\}(x)$ has no (total) recursive extension.

I.103. Express the projections $K$ and $L$ of $J(x, y) = (x + y)^2 + x$ in closed form, that is, without using $(\mu y)$ or bounded quantification.
(Hint. Solve for $x$ and $y$ the Diophantine equation $z = (x + y)^2 + x$. The term $\lfloor\sqrt{z}\rfloor$ is involved in the solution.)

I.104. Prove that the pairing function $J(x, y) = (x + y)(x + y + 1)/2 + x$ is onto (of course, the division by 2 is exact), and find its projections $K$ and $L$ in closed form.
(Hint. For the onto part you may convince yourself that $J$ enumerates pairs as follows (starting from the 0th pair, $(0, 0)$): It enumerates by ascending group number, where the group number of the pair $(a, b)$ is $a + b$. In each group it enumerates by ascending first component; thus $(a, b)$ is $a$th in the group of number $a + b$.)
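The enumeration described in the hint to Exercise I.104 can be checked on an initial segment. The Python sketch below lists pairs group by group (ascending sum, then ascending first component) and confirms that their codes under $J$ are exactly 0, 1, 2, and so on. This is a finite experiment, not a proof of onto-ness, and it deliberately does not compute the projections. The names `J` and `pairs_in_hint_order` are assumptions of this sketch.

```python
def J(x, y):
    """The pairing function J(x, y) = (x + y)(x + y + 1)/2 + x.
    One of two consecutive integers is even, so // is exact here."""
    s = x + y
    return s * (s + 1) // 2 + x


def pairs_in_hint_order(max_group):
    """Pairs (a, b) listed by ascending group number a + b, and inside
    each group by ascending first component a."""
    return [(a, g - a) for g in range(max_group + 1) for a in range(g + 1)]


pairs = pairs_in_hint_order(3)
print(pairs[:6])               # [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0)]
print([J(a, b) for a, b in pairs])  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

On every initial segment of groups the codes form an unbroken run starting at 0, which is the pattern the hint asks the reader to establish in general.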

I.105. Find a polynomial onto pairing function via the following enumeration: Enumerate by group number. Here the group number of $(a, b)$ is $\max(a, b)$, that is, $(a \dot{-} b) + b$. In group $i$ the enumeration is

$$(0, i), (1, i), (2, i), \ldots, (i - 1, i), (i, i), (i, i - 1), (i, i - 2), (i, i - 3), \ldots, (i, 1), (i, 0)$$

Find also the projections $K$ and $L$ in closed form.

By the way, what makes this $J$ "polynomial" is that (like the one in Exercise I.103 above) it does not involve division. It only involves $+$, $\times$, $\dot{-}$ and substitutions.

I.106. Prove that $\lambda\vec{x}_n.\langle\vec{x}_n\rangle$, where $\langle\ldots\rangle$ is that defined on p. 165, is in $\mathcal{PR}$.

I.107. Prove that if $t$ is a closed term of $L_N$, then

$$\mathrm{ROB} \vdash t = \widetilde{t^{\mathfrak{N}}}$$

(Hint. Induction on terms.)

I.108. Prove, by constructing an appropriate model, that $(\exists x)x < x$ is consistent with ROB, and therefore $\mathrm{ROB} \nvdash \neg x < x$.
(Hint. For example, you can build a model on the set $\mathbb{N} \cup \{\infty\}$, where "$\infty$" is a new symbol (not in $\mathbb{N}$).)

I.109. Prove, by constructing an appropriate model, that $(\exists x)(\exists y)x + y \neq y + x$ is consistent with ROB, and therefore $\mathrm{ROB} \nvdash x + y = y + x$.

I.110. Prove that $x \times y = z$ is ROB-definable.

I.111. Prove I.9.45.

I.112. Complete the proof of I.9.46.

I.113. Prove that if $A \subseteq \mathbb{N}$ is positively strongly formally definable in ROB (that is, for some $\mathcal{A}$ and all $n$, $n \in A$ iff $\mathrm{ROB} \vdash \mathcal{A}(\tilde{n})$) then it is semi-recursive. What can you say if we drop "positively"? Is the converse true?

I.114. Is either of the two sets in I.9.55 recursive? Why?

I.115. Prove that if $\Gamma$ is a complete extension of ROB and $\{\ulcorner\mathcal{A}\urcorner : \mathcal{A} \in \Gamma\}$ is recursive, then $\Gamma$ has a decidable decision problem, i.e., $\{\ulcorner\mathcal{A}\urcorner : \Gamma \vdash \mathcal{A}\}$ is recursive.

I.116. Gödel's first incompleteness theorem, original version. Prove that if $\Gamma$ is an ω-consistent extension of ROB and $\{\ulcorner\mathcal{A}\urcorner : \mathcal{A} \in \Gamma\}$ is recursive, then $\Gamma$ is incomplete.
(Hint. This is a suggestion for "proof by hindsight", using recursion-theoretic techniques (not Gödel's original proof). So, prove, under the stated assumptions, that every semi-recursive $A \subseteq \mathbb{N}$ is positively strongly definable (cf. Exercise I.113) in $\Gamma$ (ω-consistency helps one direction). Thus, the halting set $K$ is so definable. What does this do to $\mathrm{Thm}_\Gamma$? Etc.)

We explore here some related formal definability concepts for functions.

We say that a total $\lambda\vec{x}_n.f(\vec{x}_n)$ is formally functionally definable (in some extension of ROB, $\Gamma$) iff for some $\mathcal{F}(y, \vec{x}_n)$ the following holds for all $b$, $\vec{a}_n$ in $\mathbb{N}$:

$$b = f(\vec{a}_n) \quad\text{implies}\quad \Gamma \vdash \mathcal{F}(y, \tilde{a}_1, \ldots, \tilde{a}_n) \leftrightarrow y = \tilde{b} \qquad \text{(1)}$$

I.117. Prove that if $\Gamma$ is a consistent extension of ROB (for example, ROB itself or a conservative extension), then, in the definition above, the informal "implies" can be strengthened to "iff".

I.118. Prove that a total $\lambda\vec{x}_n.f(\vec{x}_n)$ is formally functionally definable (in some extension of ROB, $\Gamma$) iff for some $\mathcal{F}(y, \vec{x}_n)$ the following hold for all $b$, $\vec{a}_n$ in $\mathbb{N}$:

(i) The graph of $f$, namely $\lambda y\vec{x}_n.y = f(\vec{x}_n)$, is formally defined in $\Gamma$ by $\mathcal{F}$ in the sense of I.9.38, and

(ii) $b = f(\vec{a}_n)$ implies $\Gamma \vdash \mathcal{F}(y, \tilde{a}_1, \ldots, \tilde{a}_n) \to y = \tilde{b}$.

I.119. Actually, the above was just a warm-up and a lemma. Prove that a total $f$ is formally functionally definable in ROB (or extension thereof) iff its graph is just definable (I.9.38).
(Hint. For the hard direction, let (using a single argument for notational convenience) $\mathcal{F}(x, y)$ define $y = f(x)$ in the sense of I.9.38. Prove that $\mathcal{G}(x, y)$, a short name for

$$\mathcal{F}(x, y) \land (\forall z < y)\neg\mathcal{F}(x, z)$$

also defines $y = f(x)$ and moreover satisfies

$$\text{if } f(a) = b \text{ then } \vdash \mathcal{G}(\tilde{a}, y) \to y = \tilde{b}$$

To this end assume $f(a) = b$ and prove, first, that $\vdash \mathcal{G}(\tilde{a}, y) \to y < S\tilde{b}$, and second (using I.9.44) that $\vdash y < S\tilde{b} \to \mathcal{G}(\tilde{a}, y) \to y = \tilde{b}$.)

I.120. Let the total $f$ of one variable be weakly definable in ROB (or extension thereof), that is, its graph is formally definable in the sense of I.9.38. Let $\mathcal{A}(x)$ be any formula. Then prove that, for some well-chosen formula $\mathcal{B}(x)$,

$$(\forall a \in \mathbb{N}) \vdash \mathcal{B}(\tilde{a}) \leftrightarrow \mathcal{A}(\widetilde{f(a)})$$

(Hint. By Exercise I.119, there is an $\mathcal{F}(y, x)$ that functionally defines $f$ ((1) above). Take for $\mathcal{B}$ the obvious: $(\exists y)(\mathcal{F}(y, x) \land \mathcal{A}(y))$.)

I.121. Let $A \subseteq \mathbb{N}$ be positively strongly definable in ROB, and the total $\lambda x.f(x)$ be functionally definable in ROB. Prove that $f^{-1}[A]$, the inverse image of $A$ under $f$, is also positively strongly definable. Give two proofs: one using the connection between ROB and recursion theoretic concepts, the second ignorant of recursion theory.
(Hint. Use Exercise I.120.)


We have used the formula $(\exists v_0)(v_0 = \tilde{n} \land \mathcal{F}(v_0))$, which is logically equivalent to $\mathcal{F}(\tilde{n})$ by the one point rule (I.7.2), on a number of occasions (e.g., p. 175), notably to "easily" obtain a Gödel number of (a formula equivalent to) $\mathcal{F}(\tilde{n})$ from a Gödel number of $\mathcal{F}(v_0)$ and $n$. This Gödel number is (pretending that $\land$ is a primitive symbol so that we do not obscure the notation)


(BYP (AY = 1,509, H), Fo) }) 


The above, with n as the only variable, is recursive (indeed, primitive recursive). 
So trivially, is, the function s of two (informal) variables n and x over N: 


s= Anx("37," v9 |, (r A”, (T = tig lst yx) 
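Clearly,

$$s(n, \ulcorner\mathcal{F}(v_0)\urcorner) = \ulcorner(\exists v_0)(v_0 = \tilde{n} \land \mathcal{F}(v_0))\urcorner$$

A fixed point, or fixpoint, of a formula $\mathcal{A}(v_0)$ in an extension of ROB, $\Gamma$, is a sentence $\mathcal{F}$ such that

$$\Gamma \vdash \mathcal{F} \leftrightarrow \mathcal{A}(\widetilde{\ulcorner\mathcal{F}\urcorner})$$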


I.122. Let $\mathrm{ROB} \leq \Gamma$ and $\mathcal{A}(v_0)$ be any formula in the language of $\Gamma$. Prove that $\mathcal{A}$ has a fixpoint $\mathcal{F}$ in $\Gamma$.
(Hint. The function $s$ above is recursive; thus so is $D = \lambda x.s(x, x)$. Therefore, $D$ is functionally definable in $\Gamma$. Now use Exercise I.120 with $f = D$. See if you can use a "good" $a \in \mathbb{N}$.)

I.123. Tarski's "undefinability of truth" again (cf. I.9.31). Prove that $T = \{\ulcorner\mathcal{A}\urcorner : \mathcal{A} \in \mathcal{T}(\mathfrak{N})\}$ is not (semantically) definable in $\mathfrak{N}$, basing the argument on the existence of fixpoints in ROB and on the latter's correctness.
(Hint. Suppose that some $L_N$ formula, say $\mathcal{A}(v_0)$, defines $T$. Use the previous exercise to find a sentence that says "I am false".)

I.124. Let us use the term strongly μ-recursive functions for the smallest set of functions, $\mathcal{R}'$, that includes the initial functions of $\mathcal{PR}$ and the functions $\lambda xy.x + y$, $\lambda xy.xy$, and $\lambda xy.x \dot{-} y$, and is closed under composition and $(\mu y)$ applied on total, regular functions $\lambda\vec{x}y.g(\vec{x}, y)$, that is, total functions satisfying

$$(\forall\vec{x})(\exists y)g(\vec{x}, y) = 0$$

Prove that $\mathcal{R}' = \mathcal{R}$.
(Hint. Use coding and decoding, e.g., via the β-function, to implement primitive recursion.)

I.125. Prove (again) that all recursive functions are functionally definable in ROB. Do so via Exercise I.124, by induction on $\mathcal{R}'$, without using the separation lemma (I.9.49).

I.126. (Craig.) Prove that a recursively enumerable set of sentences $\mathcal{T}$ over a finitely generated language (e.g., like that of arithmetic) admits a recursive set of axioms, i.e., for some recursive $\Gamma$, $\mathcal{T} = \mathrm{Thm}_\Gamma$.
(Hint. Note that for any $\mathcal{A} \in \mathcal{T}$, any two sentences in the sequence

$$\mathcal{A},\ \mathcal{A} \land \mathcal{A},\ \mathcal{A} \land \mathcal{A} \land \mathcal{A},\ \ldots$$

are logically equivalent. Now see if Exercises I.94 and I.95 can be of any help.)

I.127. Let $\mathcal{T}$ be the pure theory over the language that contains precisely the following nonlogical symbols: One constant, one unary function, two binary functions, and one binary predicate. Prove that $\mathcal{T}$ has an undecidable decision problem, that is, $\{\ulcorner\mathcal{A}\urcorner : \mathcal{A} \in \mathcal{T}\}$ is not recursive.


II


The Second Incompleteness Theorem


Our aim in the previous section was to present Gödel's first incompleteness theorem in the context of recursion theory. Much as this "modern" approach is valuable for showing the links between unprovability and uncomputability, it has obscured the simplicity of Gödel's ingenious idea (as it was carried out in his original paper (1931)).

What he had accomplished in that paper, through arithmetization of formulas and proofs, was to build a sentence of arithmetic, $\mathcal{F}$, that said "I am not a theorem". One can easily prove, metamathematically, that such an $\mathcal{F}$ is undecidable, if arithmetic is ω-consistent.

To see this at the intuitive level, let us replace ω-consistency by correctness. Then surely $\mathcal{F}$ is not provable, for if it is, then it is a theorem, and hence false (contradicting correctness).

On the other hand, we have just concluded that $\mathcal{F}$ is true! Hence, $\neg\mathcal{F}$ is false, and therefore not provable either (by correctness).

This simple application of the "liar's paradox"† is at the heart of the first incompleteness theorem.

Imagine now that the arithmetization is actually carried out within (some) formal arithmetic, and that with some effort we have managed to embed into formal arithmetic the metamathematical argument that leads to the assertion "if arithmetic is consistent, then $\nvdash \mathcal{F}$".‡ The quoted statement is formalized by "$\mathrm{Con} \to \mathcal{F}$", where "Con" is some (formal) sentence that says that arithmetic is consistent. This is so, intuitively, since $\mathcal{F}$ "says": "$\mathcal{F}$ is not a theorem".

† Attributed in its original form to Epimenides of Crete, who proclaimed: "All Cretans are liars". Is this true? Gödel's version is based on the variation: "I am lying". Where does such a proclamation lead?

‡ Where, of course, "$\vdash$" is using the nonlogical axioms of our formal arithmetic.

It follows that $\nvdash \mathrm{Con}$ (why?).


This is Gödel's second incompleteness theorem, that if a recursively axiomatized theory is a consistent extension of (Peano) arithmetic, then it cannot prove the sentence that asserts its own consistency.


In order to prove this theorem we will need to develop enough formal arith- 
metic to be able to carry out elementary arguments in it. We will also need to 
complete the details of our earlier arithmetization, within formal arithmetic this 
time. 


II.1. Peano Arithmetic 


We start by extending ROB with the addition of the induction axiom schema, to 
obtain Peano arithmetic. Within this arithmetic we will perform all our formal 
reasoning and constructions (arithmetization). 

As a by-product of the required extra care that we will exercise in this
section, regarding arithmetization details, we will be able to see through some
technicalities that we have suppressed in I.9.33–I.9.34 and I.9.37. For example,
we had accepted there, without explicitly stating so, that our standard rules of
inference, modus ponens and (∃)-introduction, are recursive, and therefore fit
the general description of the unspecified rules I₁ and I₂ on p. 176.

We will soon see below why this is so. Similarly, we have said that Λ is
recursive. At the intuitive level this is trivial, since some logical axioms are
recognized by their form (e.g., the axiom x = x), while the tautologies can be
recognized by constructing a truth table. While this ability to recognize
tautologies can be demonstrated rigorously, that is far too tedious an undertaking.
Instead, we opt for a more direct avenue. In the exercises (Exercises I.26–I.41)
we led the reader to adopt a finite set of axiom schemata in lieu of the infinitely
many schemata whose instances are tautologies. This makes it much easier to
see that this amended Λ is recursive (even primitive recursive).


II.1.1 Definition (Peano Arithmetic). We extend ROB over the same language,
L_N, by adding the induction axiom (in reality, an induction schema),
Ind, below:

𝒜[x ← 0] ∧ (∀x)(𝒜 → 𝒜[x ← Sx]) → 𝒜          (Ind)

We often write more simply†

𝒜[0] ∧ (∀x)(𝒜[x] → 𝒜[Sx]) → 𝒜[x]


† Cf. p. 33.

or just

𝒜[0] ∧ (∀x)(𝒜 → 𝒜[Sx]) → 𝒜


The theory ROB + Ind is called Peano arithmetic, for short PA.


II.1.2 Remark (A Note on Nonlogical Schemata and Defined Symbols).
Metatheorems I.7.1 and I.7.3 ensure that the addition of defined predicates,
functions, and constants to any language and theory results in a conservative
extension of the theory, that is, any theorem of the new theory over the old
(original) language is also provable in the old theory. Moreover, we saw how
any formula 𝒜 of the new language can be naturally transformed to a formula
𝒜* of the old language (eliminating defined symbols), so that

𝒜 ↔ 𝒜*          (1)

is provable in the extended theory.
There is one potential worry about the presence of nonlogical schemata — 
such as the induction axiom schema of PA — that we want to address: 


First off, while logical axioms are “good” over any first order language (they
are “universal”), nonlogical axioms and schemata on the other hand are specific
to a theory and its basic language, i.e., the language that existed prior to any
extensions by definitions. Thus, the induction schema is an “agent” that yields a
specific nonlogical axiom (an instance of the schema) for each specific formula,
over the basic language L_N, that we care to substitute into the metavariable 𝒜.

There is no a priori promise that the schema “works” whenever we replace
the syntactic variable 𝒜 by a formula, say “ℬ”, over a language extension
obtained by definitions. By “works”, of course, we mean that the produced
schema instance is a theorem in the extended theory.


Well, does it “work”? Indeed it does; for let us look at the formula

ℬ[x ← 0] ∧ (∀x)(ℬ → ℬ[x ← Sx]) → ℬ          (2)

where the particular formula ℬ may contain defined symbols. Following the
technique of symbol elimination (cf. Remark I.7.4(a), p. 120), eliminating at
the atomic formula level, we obtain the following version of (2), in the basic
language of PA. This translated version has exactly the same form as (2) (i.e.,
Ind), namely

ℬ*[x ← 0] ∧ (∀x)(ℬ* → ℬ*[x ← Sx]) → ℬ*

Thus, being a schema instance over the basic language, it is an axiom of PA,
and hence also of its extension (by definitions). Now, by (1), the equivalence
theorem (Leibniz rule I.4.25) yields the following theorem of the extended
theory:

(ℬ[x ← 0] ∧ (∀x)(ℬ → ℬ[x ← Sx]) → ℬ)
↔
(ℬ*[x ← 0] ∧ (∀x)(ℬ* → ℬ*[x ← Sx]) → ℬ*)

Hence (2) is a theorem of the extended theory. □


II.1.3 Remark (The Induction “Rule”). In practice, instead of Ind, we usually
employ the (derived) rule of inference Ind′ that we obtain from Ind by invoking
modus ponens and the duo I.4.7–I.4.8:†

𝒜[x ← 0], 𝒜 → 𝒜[x ← Sx] ⊢ 𝒜          (Ind′)

The rule is normally applied as follows: We ascertain that the premises apply
by

(1) proving 𝒜[0] (this part is called the basis of the induction, just as in the
informal case over ℕ), and

(2) adding the induction hypothesis (I.H.) 𝒜 to the axioms, treating the free
variables of 𝒜 as new constants‡ until we can prove 𝒜[Sx].

We then have a proof of 𝒜 → 𝒜[Sx] by the deduction theorem.
Ind′ now allows us to conclude that 𝒜 has been proved by induction on x.


What is interesting is that Ind′ implies Ind; thus the two are equivalent.
What makes this interesting is that while the deduction theorem readily yields
Ind from Ind′, it does so under the restriction that the free variables in 𝒜[0]
and (∀x)(𝒜 → 𝒜[Sx]) must be treated as new constants (be “frozen”). We
can do without the deduction theorem and without the restriction.

Let us then fix a formula 𝒜 and prove Ind assuming Ind′ (see, e.g.,
Shoenfield (1967)).

We let§

ℬ ≡ 𝒜[0] ∧ (∀x)(𝒜 → 𝒜[Sx]) → 𝒜

To prove ℬ, using Ind′, we need to prove

ℬ[0]          (1)


† Throughout this chapter, the symbol ⊢ implies a subscript, PA, or some extension thereof, unless
something else is clear from the context.

‡ That is, disallowing universal quantification over, or substitution in the variables. Of course,
existential quantification is always possible by Ax2.

§ Here “≡” means equality of strings; cf. I.1.4.

and

ℬ → ℬ[Sx]          (2)

Now, (1) is a tautology, while (2) is tautologically equivalent to

(𝒜[0] ∧ (∀x)(𝒜 → 𝒜[Sx]) ∧ ¬𝒜) ∨ ¬𝒜[0] ∨ ¬(∀x)(𝒜 → 𝒜[Sx]) ∨ 𝒜[Sx]          (3)

In turn, (3), after distributing ∨ over ∧, is seen to be tautologically equivalent
to

𝒜[0] → (∀x)(𝒜 → 𝒜[Sx]) → 𝒜 → 𝒜[Sx]

which is provable by tautological implication and specialization (I.4.7). □


For the next little while we will be rediscovering arithmetic in the formal 
setting of PA. In the process, new predicate and function symbols will be 
introduced to the language (with their attendant axioms — as in Section I.7). 


II.1.4 Definition. We introduce the predicate ≤ by x ≤ y ↔ x < y ∨ x = y.


For the balance of the section ⊢ means ⊢_PA unless noted otherwise.
Of course, in those instances where we add axioms (in order to argue by
the deduction theorem, or by auxiliary constant, or by cases, etc.) by a sentence
such as “Let 𝒜 …” or “Add 𝒜 …”, “⊢” will mean provability in the augmented
theory PA + 𝒜. □


II.1.5 Lemma. ⊢ 0 ≤ x.


Proof. We use induction on x.† For convenience, we let 𝒜 ≡ 0 ≤ x.
Basis. ⊢ 𝒜[0], since 0 = 0 ⊨_Taut 𝒜[0] by II.1.4.
I.H. Add 𝒜.‡

Now, 𝒜[Sx] ≡ 0 < Sx ∨ 0 = Sx; thus, by <2 (p. 175) and the Leibniz
rule (I.4.25, p. 51), ⊢ 𝒜[Sx] ↔ 𝒜 ∨ 0 = Sx. We are done by ⊨_Taut and I.H.


II.1.6 Lemma (Transitivity of <). ⊢ x < y ∧ y < z → x < z.


† This, usually, means that we are using the induction rule, Ind′, rather than the induction schema
itself.

‡ Instead of “add”, we often say “let”. Either means that we are about to invoke the deduction
theorem, and we are adding a new nonlogical axiom, with all its free variables “frozen”.
Proof. Induction on z. Let† 𝒜[z] ≡ x < y ∧ y < z → x < z.
Basis. ⊢ 𝒜[0] from <1 and ⊨_Taut.
I.H. Add 𝒜.

Add x < y ∧ y < Sz (to prove x < Sz). This yields (by <2, the Leibniz
rule, and ⊨_Taut)

x < y ∧ y < z ∨ x < y ∧ y = z          (1)

We also have (the I.H.) 𝒜:

x < y ∧ y < z → x < z

and (Leibniz equality axiom)

x < y ∧ y = z → x < z

Thus, using ⊨_Taut, x < z follows from (1). This last conclusion, by ⊨_Taut and
<2, yields x < Sz.


II.1.7 Corollary (Irreflexivity of <). ⊢ ¬x < x.


Proof. Induction on x.
Basis. By <1.
I.H. Add ¬x < x.

We want to deduce ¬Sx < Sx. Arguing by contradiction, we add Sx < Sx,
that is, via <2,

⊢ Sx < x ∨ Sx = x          (1)

Now, ⊢ x < Sx by <2, the axiom x = x, and ⊨_Taut. Thus, using (1),

⊢ x < Sx ∧ (Sx < x ∨ Sx = x)

which yields x < x by II.1.6 and the Leibniz equality axiom (Ax4 of Λ),
contradicting the I.H.


Intuitively, “<” is a (strict) order (irreflexive and transitive). Thus the induction
axiom has a net contribution to ROB (cf. Exercise I.108).


II.1.8 Lemma (Antisymmetry of ≤). ⊢ x ≤ y ∧ y ≤ x → x = y.


Proof. Assume the hypothesis

x ≤ y ∧ y ≤ x


† That is, “z” is our favourite free variable in the formula; see I.1.11, p. 18.

By II.1.4 this is tautologically equivalent to

¬(x < y ∧ y < x) → ¬(x < y ∧ x = y) → ¬(y < x ∧ x = y) → x = y

Since ⊢ ¬(x < y ∧ y < x) by transitivity and irreflexivity (e.g., use proof by
contradiction for this sub-claim) while each of

¬(x < y ∧ x = y)

and

¬(y < x ∧ x = y)

are theorems by the Leibniz axiom and irreflexivity, x = y follows by modus
ponens.


II.1.9 Lemma. ⊢ x < y → Sx < Sy.


Proof. Induction on y.
Basis. ⊢ x < 0 → Sx < S0, by <1 and ⊨_Taut.
I.H. Add x < y → Sx < Sy.
Add x < Sy towards proving Sx < SSy.
Hence ⊢ x < y ∨ x = y, by <2. By I.H. and the Leibniz equality axiom,
⊢ Sx < Sy ∨ Sx = Sy
Therefore ⊢ Sx < SSy by <2.


II.1.10 Corollary. ⊢ x < y ↔ Sx ≤ y.


Proof. →: ⊢ Sx ≤ y ↔ Sx < Sy, by <2.

←: By ⊢ x < Sx and transitivity.


II.1.11 Proposition. Axiom < 3 is redundant in the presence of the induction 
axiom. 


Proof. Exercise II.1.

We give notice that in what follows we will often use x > y to mean y < x.


We now state without proof some “standard” properties of + and ×. The
usual proof tool here will be induction. The lemma below will need a double
induction, that is, on both x and y.†


† Actually, if you prove associativity first and also the theorem S0 + x = Sx, then you can prove
commutativity with a single induction, after you have proved 0 + x = x.
II.1.12 Lemma. ⊢ x + y = y + x.


Proof. Exercise II.3.


II.1.13 Lemma. ⊢ x + (y + z) = (x + y) + z.


Proof. Exercise II.4.


II.1.14 Lemma. ⊢ x × y = y × x.


Proof. Exercise II.5.


II.1.15 Lemma. ⊢ x × (y × z) = (x × y) × z.


Proof. Exercise II.6.


II.1.16 Lemma. ⊢ x × (y + z) = (x × y) + (x × z).


Proof. Exercise II.7.


We adopt the usual priorities of arithmetic operations; thus, instead of

x × (y + z) = (x × y) + (x × z)

we will normally use the argot

x × (y + z) = x × y + x × z

We will adopt one more useful abbreviation (argot): From now on we will
write xy for x × y (implied multiplication notation). Moreover we will often
take advantage of properties such as commutativity without notice, for example,
writing x + y for y + x. □


II.1.17 Lemma. ⊢ x < y → x + z < y + z.


Proof. Induction on z.
Basis. From +1.
I.H. Add x < y → x + z < y + z.

Let x < y. By I.H., ⊢ x + z < y + z. By II.1.9, ⊢ S(x + z) < S(y + z).
By +2, ⊢ x + Sz < y + Sz.
II.1.18 Corollary. ⊢ x + z < y + z → x < y.


Proof. Let x + z < y + z. Add ¬x < y, which, by <3, tautologically implies

y < x ∨ y = x          (1)

Since ⊢ y < x → y + z < x + z (II.1.17) and ⊢ y = x → y + z = x + z
(Leibniz axiom), ⊨_Taut and (1) yield

⊢ y + z ≤ x + z

Along with the original assumption and transitivity, we have just contradicted
irreflexivity.


II.1.19 Corollary. ⊢ z > 0 → Sx ≤ x + z.


Proof. We have ⊢ 0 < z → 0 + x < z + x. Commutativity of +, and +1,
lead to the claim.


II.1.20 Lemma. ⊢ z > 0 ∧ x < y → xz < yz.


Proof. Induction on z.
Basis. By <1.
I.H. Add z > 0 ∧ x < y → xz < yz.
Let now

x < y          (1)

As ⊢ 0 < Sz anyhow (<2 and II.1.5), we embark on proving

xSz < ySz

By ×2 (and the Leibniz axiom, twice) the above is provably equivalent to

x + xz < y + yz          (2)

so we will just prove (2).
By II.1.5 there are two cases for z.†
Case z = 0. Thus, by the Leibniz axiom and ×1,

⊢ xz = 0

and

⊢ yz = 0


† Cf. p. 51.

By (1), II.1.17, and the Leibniz axiom,‡

⊢ x + xz < y + yz

Case z > 0. First off, by I.H. and II.1.17,

⊢ y + xz < y + yz

Also, by II.1.17,

⊢ x + xz < y + xz

These last two and transitivity (II.1.6) yield

⊢ x + xz < y + yz


II.1.21 Corollary (Cancellation Laws).

⊢ x + z = y + z → x = y

⊢ z > 0 ∧ xz = yz → x = y


Proof. By <3, II.1.7, II.1.17, and II.1.20.


II.1.22 Corollary.

⊢ x < y ∧ z < w → x + z < y + w

⊢ x < y ∧ z < w → xz < yw


Proof. By II.1.6, II.1.17, and II.1.20.


II.1.23 Theorem (Course-of-Values Induction). For any formula 𝒜, the following
is provable in PA:

(∀x)((∀z < x)𝒜[z] → 𝒜) → (∀x)𝒜          (1)


It goes without saying that z is a new variable. Such annoying (for being
obvious) qualifications we omit as a matter of policy.


In practice, the schema (1) is employed in conjunction with the deduction
theorem, as follows: One proves 𝒜 with frozen free variables, on the I.H. that


‡ This “⊢” is effected in the extension of PA that contains the I.H. and also x < y and z = 0.

“(∀z < x)𝒜[z] is true” (i.e., with the help of the new axiom (∀z < x)𝒜[z]).
This is all one has to do, since then ⊢ (∀z < x)𝒜[z] → 𝒜 (deduction theorem)
and hence (generalization) ⊢ (∀x)((∀z < x)𝒜[z] → 𝒜).

By (1), ⊢ (∀x)𝒜 now follows. □


Proof. To prove (1), we let the name ℬ (or ℬ[x], since we are interested in
x) stand for (∀z < x)𝒜[z], that is,

ℬ ≡ (∀z)(z < x → 𝒜[z])

Proving (1) via the deduction theorem dictates that we next assume (1)’s
hypothesis, that is, we add the axiom

(∀x)(ℬ → 𝒜)[c⃗]

where c⃗ are new distinct constants, substituted into all the free variables of
(∀x)(ℬ → 𝒜). The above yields

⊢ ℬ[x, c⃗] → 𝒜[x, c⃗]          (2)

Our next (subsidiary) task is to establish

⊢ (∀x)ℬ[x, c⃗]          (3)

by induction on x.
Basis. ⊢ ℬ[0, c⃗] by ⊨_Taut and <1.
I.H. Add ℬ[x, c⃗], with frozen x, of course.† By (2),

⊢ 𝒜[x, c⃗]          (4)

Now, ℬ[Sx, c⃗] ≡ (∀z < Sx)𝒜[z, c⃗]; thus

ℬ[Sx, c⃗]
↔ ⟨by the Leibniz rule, ⊨_Taut and <2; ∀ distributes over ∧ (Exercise I.23)⟩
(∀z)(z < x → 𝒜[z, c⃗]) ∧ (∀z)(z = x → 𝒜[z, c⃗])
↔ ⟨I.7.2⟩
ℬ[x, c⃗] ∧ 𝒜[x, c⃗]


† A constant. But we will not bother to name it anything like “c”.
‡ y is still frozen.

Pause. We applied the style of “equational proofs” or “calculational proofs”†
immediately above (chain of equivalences). The “↔” is used conjunctionally,
that is, “⊢ 𝒟₁ ↔ 𝒟₂ ↔ ⋯ ↔ 𝒟ₙ” means “⊢ (𝒟₁ ↔ 𝒟₂) ∧ ⋯ ∧ (𝒟ₙ₋₁ ↔ 𝒟ₙ)”;
hence, by tautological implication, ⊢ 𝒟₁ ↔ 𝒟ₙ.


By (4) and the I.H. we have proved ℬ[Sx, c⃗]. Hence our induction has
concluded: We now have (3).

By (2), (3), ∀-monotonicity (I.4.24), and modus ponens we infer (∀x)𝒜[x, c⃗];
hence, by the deduction theorem,

⊢ (∀x)(ℬ → 𝒜)[c⃗] → (∀x)𝒜[x, c⃗]

Applying the theorem on constants, we get (1).


We have applied quite a bit of pedantry above during the application of the
deduction theorem, invoking the theorem on constants explicitly. This made it
easier to keep track of which variables were frozen, and when. □


II.1.24 Corollary (The Least Principle). The following is provable in PA for
any 𝒜:

(∃x)𝒜 → (∃x)(𝒜 ∧ (∀z < x)¬𝒜[z])          (LP)


Proof. The course-of-values induction schema applied to ¬𝒜 is provably
equivalent to (LP) above.


We now have enough tools to formalize unbounded search, (μy), in PA (cf.
I.8.8, p. 127).


Suppose that we have‡

⊢ (∃x)𝒜          (E)

Informally this says that for all values of the free variables there is a
(corresponding value) x that makes 𝒜 true. In view of the least principle, we must
then be able to define a “total function on the natural numbers” which, for each
input, returns the smallest x that “works”. Formally this “function” will be a
function letter, introduced into the theory (PA) by an appropriate definition


† Cf. Dijkstra and Scholten (1990), Gries and Schneider (1994), Tourlakis (2000a, 2000b, 2001b).
‡ Reminder: We continue using “⊢” for “⊢_T”, where T is PA, possibly extended by definitions.

(Section I.7), whose natural interpretation in 𝔑 will be the total function we
have just described. The formal details are as follows:


Let ℬ stand for

𝒜 ∧ (∀z < x)¬𝒜[z]          (B)

By the least principle and (E) (existence condition),

⊢ (∃x)ℬ          (1)


Pause. By ∃-monotonicity (I.4.23), (1) implies (E), since ⊢ ℬ → 𝒜. Thus
(1) and (E) are provably equivalent in PA, by II.1.24.


We next show that the “∃” in (1) is really “∃!”. To this end, we prove

ℬ[x] ∧ ℬ[y] → x = y          (2)

Add ℬ[x] and ℬ[y] (with frozen free variables), that is, add

𝒜[x] ∧ (∀z < x)¬𝒜[z]

and

𝒜[y] ∧ (∀z < y)¬𝒜[z]

These entail

𝒜[x]          (3)
z < x → ¬𝒜[z]          (4)
𝒜[y]          (5)
z < y → ¬𝒜[z]          (6)

We will now show that adding

x < y ∨ y < x          (7)

will lead to a contradiction, therefore establishing (by <3)

x = y

Having added (7) as an axiom, we now have two cases to consider:
Case x < y. Then (6) yields ¬𝒜[x], contradicting (3).
Case y < x. Then (4) yields ¬𝒜[y], contradicting (5).
We have established (2).
We can now introduce a new function symbol, f, in L_N by the axiom

f y⃗ = x ↔ ℬ          (f)

where the list y⃗, x contains all the free variables of 𝒜, and perhaps others.
Pause. Is “and perhaps others” right? Should it not be “exactly the free
variables of 𝒜”?


The “enabling necessary condition” for (f) is, of course, (1), or, equivalently,
(E), above. We will always speak of (E) as the enabling existence
condition.

The following alternative notation to (f) above is due to Hilbert and Bernays
(1968) and better captures the intuitive meaning of the whole process we have
just described:

f y⃗ = (μx)𝒜          (f′)

or, using the Whitehead-Russell “ι”,

(μx)𝒜 ≝ (ιx)ℬ          (f″)

Thus, we can always introduce a new function symbol f in PA by the explicit
definition (f′) as long as we can prove (E) above.


Note that, by I.7.1 and I.7.3, such extensions of PA are conservative.


We can also say that we have introduced a μ-term, (μx)𝒜, if we want to
suppress the details of introducing a new function letter, etc. □

Axiom (f) yields at once†

⊢ ℬ[f y⃗]          (f⁽³⁾)

therefore, by (B), <3, and ⊨_Taut,

⊢ 𝒜[f y⃗]          (f⁽⁴⁾)

and

⊢ 𝒜[z] → f y⃗ ≤ z          (f⁽⁵⁾)


Here “μ” is the formal counterpart of the “μ” (unbounded search) of
nonformalized recursion theory (I.8.8). We have restricted its application in the
formal theory so that functions defined as in (f′) are total (due to (E)).
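
As an informal illustration only, the least-witness search that “μ” formalizes can be sketched in Python over the standard natural numbers. The names `mu` and `isqrt_ceil` are hypothetical, and the sketch assumes the existence condition (E) holds for the predicate supplied; otherwise the search does not terminate. It is not part of the formal development in PA.

```python
from itertools import count

def mu(pred):
    """Return the least natural number x with pred(x) true.

    Assumes some witness exists (the informal analogue of condition (E));
    if none exists, this loop never terminates.
    """
    for x in count():
        if pred(x):
            return x

def isqrt_ceil(y):
    """Least x with x*x >= y. A witness always exists, since x = y works."""
    return mu(lambda x: x * x >= y)

if __name__ == "__main__":
    v = isqrt_ceil(10)
    print(v)                                   # the least witness
    print(v * v >= 10)                         # the witness satisfies the predicate
    print(all(z * z < 10 for z in range(v)))   # nothing smaller does
```

The last two printed checks mirror the two consequences of the defining axiom: the value satisfies 𝒜, and any z satisfying 𝒜 is at least as large as the value.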

We will also need to formalize primitive recursion, that is, to show that given
any function symbols g and h of L_N, of arities n + 2 and n respectively, we
can introduce a new function symbol f of arity n + 1 satisfying the (defining)
axioms

f0y⃗ₙ = hy⃗ₙ
fSxy⃗ₙ = gxy⃗ₙfxy⃗ₙ


† Where the machinery for “⊢” includes the defining axiom (f).

Needless to stress that x, y⃗ₙ are free variables. Note that the pair of equations
above generalizes the manner in which + and × were introduced as primeval
symbols in ROB and PA.
More “user-friendly” (argot) notation for the above recurrence equations is

f(0, y⃗ₙ) = h(y⃗ₙ)
f(Sx, y⃗ₙ) = g(x, y⃗ₙ, f(x, y⃗ₙ))
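
The informal content of the recurrence equations can be sketched as follows. This is an illustration over ordinary Python integers, not the formal introduction of f in PA; `prim_rec`, `add` and `mul` are hypothetical names chosen for the example.

```python
def prim_rec(h, g):
    """Build f with f(0, *ys) = h(*ys) and f(x+1, *ys) = g(x, *ys, f(x, *ys))."""
    def f(x, *ys):
        acc = h(*ys)                 # value at x = 0
        for i in range(x):           # unfold the recurrence x times
            acc = g(i, *ys, acc)
        return acc
    return f

def S(n):
    """Successor."""
    return n + 1

# Addition by recursion on the first argument: add(0, y) = y, add(Sx, y) = S(add(x, y)).
add = prim_rec(lambda y: y, lambda x, y, acc: S(acc))

# Multiplication: mul(0, y) = 0, mul(Sx, y) = y + mul(x, y).
mul = prim_rec(lambda y: 0, lambda x, y, acc: add(y, acc))

if __name__ == "__main__":
    print(add(3, 4), mul(3, 4))
```

Here `add` and `mul` follow the same pattern by which + and × are governed by their axioms in ROB, with the roles of h and g made explicit.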


To be able to handle primitive recursion we need to strengthen our grasp of
“arithmetic” in PA, by developing a few more tools.

First off, for any term t we may introduce a new function symbol f by a
“definition” (axiom)

f y⃗ = t          (8)

where y⃗ contains all the free variables of t (but may contain additional free
variables).


(8) is our preferred short notation for introducing f. The long notation is by
quoting

⊢ (∃x)x = t          (8′)

and†

f y⃗ = (μx)x = t          (8″)

Of course, the enabling condition (8′) is satisfied, by logical axiom Ax2. □


We next introduce the (formal) characteristic function of a formula, a formal
counterpart of the characteristic function of a relation (I.8.16).

Let 𝒜 be any formula, and y⃗ₙ the list of its free variables. We introduce a
new n-ary function symbol χ_𝒜 by the explicit definition

χ_𝒜 y⃗ₙ = (μx)(𝒜 ∧ x = 0 ∨ ¬𝒜 ∧ x = 1)          (C)

As always, we must be satisfied that the enabling condition

⊢ (∃x)(𝒜 ∧ x = 0 ∨ ¬𝒜 ∧ x = 1)          (C′)

holds. Well, since ⊢ 𝒜 ∨ ¬𝒜, we may use proof by cases.
Case 𝒜. Now, ⊢ 𝒜 ∧ 0 = 0 ∨ ¬𝒜 ∧ 0 = 1, thus (C′) follows.

Case ¬𝒜. This time ⊢ 𝒜 ∧ 1 = 0 ∨ ¬𝒜 ∧ 1 = 1, thus (C′) follows once
more.


† I shall eventually stop issuing annoyingly obvious reminders such as: “x is not, of course, chosen
among the variables of t”.
‡ Cf. p. 51.
II.1.25 Remark (Disclaimer). “[T]he (formal) characteristic function of a
formula” (emphasis added) above are strong words indeed. Once a χ_𝒜 has been
introduced as above, one may subsequently introduce a new function symbol
f by the explicit definition f y⃗ₙ = χ_𝒜 y⃗ₙ, and this too satisfies (C) and its
corollaries (9) and (10) below. For example, ⊢ 𝒜 ↔ f y⃗ₙ = 0.

Thus, “the” characteristic function symbol is in fact not unique.† Nevertheless,
we may occasionally allow the tongue to slip. The reader is hereby
forewarned. □


II.1.26 Remark (About “⊢”, Again: Recursive Extensions of PA). From
now on, proofs take place in an (unspecified) extension of PA effected by a
finite sequence of μ-definitions or‡ definitions of new predicate symbols. To
be exact, we work in a theory PA′ defined as follows: There is a sequence of
theories Tᵢ, for i = 0, …, m, each over a language L_{Nᵢ}, such that

(i) L_{N₀} = L_N, T₀ = PA, T_m = PA′, and
(ii) for each i = 0, …, m − 1, Tᵢ₊₁ is obtained from Tᵢ by
(a) adding a single new function symbol f to L_{Nᵢ}, to obtain L_{Nᵢ₊₁}, and
adding the axiom

f(y⃗) = (μx)𝒜          (F)

to Tᵢ, having shown first that ⊢_{Tᵢ} (∃x)𝒜, or
(b) adding a single new n-ary predicate symbol P to L_{Nᵢ}, to obtain L_{Nᵢ₊₁},
and adding the axiom§

P x⃗ₙ ↔ 𝒜(x⃗ₙ)          (P)

to Tᵢ.


We will restrict from now on the form of function and predicate definitions
(F) and (P) above, so that in each case the formula 𝒜 is in Δ₀(L_N)
(see I.9.27), where L_N is the language to which the new symbol is being
added.

Under the above restriction on 𝒜, we call any extension PA′ of PA, as
described in (i)–(ii) above, a recursive extension (Shoenfield (1967),
Schwichtenberg (1978)).¶ □


† Its interpretation, or “extension”, in the standard structure is unique, of course.

‡ Inclusively speaking.

§ Such a definition effected the introduction of “≤” in II.1.4.

¶ Actually, those authors require 𝒜 to be an open formula. For the purposes of Theorem II.4.12
later on, the two formulations make no difference.
One can now show that

⊢ 𝒜 ↔ χ_𝒜 y⃗ₙ = 0          (9)

and

⊢ ¬𝒜 ↔ χ_𝒜 y⃗ₙ = 1          (10)

For (9), add 𝒜,† and prove

χ_𝒜 y⃗ₙ = 0          (11)

We have (cf. (f), p. 218)

⊢ 𝒜 ∧ χ_𝒜 y⃗ₙ = 0 ∨ ¬𝒜 ∧ χ_𝒜 y⃗ₙ = 1

Since

𝒜, 𝒜 ∧ χ_𝒜 y⃗ₙ = 0 ∨ ¬𝒜 ∧ χ_𝒜 y⃗ₙ = 1 ⊨_Taut χ_𝒜 y⃗ₙ = 0

(11) follows. In short, we have the →-direction of (9). Similarly, one obtains
the →-direction of (10).

A by-product of all this, in view of tautological implication and ⊢ 𝒜 ∨ ¬𝒜,
is

⊢ χ_𝒜 y⃗ₙ = 0 ∨ χ_𝒜 y⃗ₙ = 1

The latter yields the ←-directions of (9) and (10) via proof by cases: Say,
we work under the case χ_𝒜 y⃗ₙ = 0, with frozen y⃗ₙ. Add ¬𝒜. By the →-half
of (10),

⊢ χ_𝒜 y⃗ₙ = 1

hence, using the assumption, we obtain ⊢ 0 = 1, which contradicts axiom S1.

Thus 𝒜, and hence the ←-half of (9), is proved. One handles (10) similarly.


Definition by cases. We want to legitimize definitions of new function symbols
f such as

         ⎧ t₁    if 𝒜₁
f x⃗ₙ =  ⎨ ⋮                    (12)
         ⎩ t_k   if 𝒜_k

where

⊢ 𝒜₁ ∨ ⋯ ∨ 𝒜_k          (13)
⊢ ¬(𝒜ᵢ ∧ 𝒜ⱼ) for all i ≠ j          (14)


† With frozen y⃗ₙ, of course.

and x⃗ₙ is the list of all free variables in the right hand side of (12). Our
understanding of the informal notation (12) is that

⊢ 𝒜ᵢ → f x⃗ₙ = tᵢ          (15)

holds for i = 1, …, k. We (formally) achieve this as follows:

Let χᵢx⃗ₙ be the† characteristic term of 𝒜₁ ∨ ⋯ ∨ 𝒜̂ᵢ ∨ ⋯ ∨ 𝒜_k, where
𝒜̂ᵢ means that 𝒜ᵢ is missing from the disjunction. Then the definition below,
in the style of (8) (p. 219), is what we want (cf. I.8.27):

f x⃗ₙ = t₁χ₁x⃗ₙ + ⋯ + t_kχ_kx⃗ₙ          (16)

Indeed, ⊢ 𝒜ᵢ → ¬(𝒜₁ ∨ ⋯ ∨ 𝒜̂ᵢ ∨ ⋯ ∨ 𝒜_k), by (14) and ⊨_Taut; hence

⊢ 𝒜ᵢ → χᵢx⃗ₙ = 1          (17)

On the other hand,‡

⊢ 𝒜ᵢ → χⱼx⃗ₙ = 0 for j ≠ i          (18)

Elementary properties§ of + and × now yield (15). □
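
A small informal sketch may help to see why the sum of products in (16) selects the right term. It works over ordinary integers, uses the text's convention that a characteristic function returns 0 when the condition holds and 1 otherwise, and assumes the supplied conditions are exhaustive and mutually exclusive as in (13) and (14). The names `chi`, `by_cases` and `abs_diff` are hypothetical.

```python
def chi(pred):
    """Characteristic function in the text's convention: 0 if pred holds, 1 otherwise."""
    return lambda *xs: 0 if pred(*xs) else 1

def by_cases(cases):
    """cases is a list of (term, condition) pairs.

    Assumes the conditions are exhaustive and pairwise exclusive.
    Returns f with f(xs) = sum of term_i(xs) * chi_i(xs), where chi_i is the
    characteristic function of the disjunction of all conditions except the i-th.
    """
    conds = [c for _, c in cases]

    def f(*xs):
        total = 0
        for i, (term, _) in enumerate(cases):
            others = [c for j, c in enumerate(conds) if j != i]
            chi_i = chi(lambda *a: any(c(*a) for c in others))
            # chi_i is 1 exactly when no other condition holds, i.e. when the
            # i-th condition holds; otherwise it is 0 and the term drops out.
            total += term(*xs) * chi_i(*xs)
        return total

    return f

# Absolute difference defined by three cases.
abs_diff = by_cases([
    (lambda x, y: x - y, lambda x, y: x > y),
    (lambda x, y: 0,     lambda x, y: x == y),
    (lambda x, y: y - x, lambda x, y: x < y),
])

if __name__ == "__main__":
    print(abs_diff(5, 2), abs_diff(2, 5), abs_diff(4, 4))
```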


We next define the formal proper subtraction function(-symbol), δ. Informally,¶
δ(x, y) stands for x ∸ y. In fact, its natural interpretation, as it is suggested
by the right hand side of (19) below, in 𝔑′ = (ℕ; 0; S, +, ×; <; …) is
x ∸ y.

We set, exactly as in informal recursion theory

δ(x, y) = (μz)(x = y + z ∨ x < y)          (19)

To allow (19) to stand, one must show that

⊢ (∃z)(x = y + z ∨ x < y)          (20)

By <3, one applies proof by cases. If x < y is the hypothesis, then, by Ax2
and modus ponens, (20) holds.


† For the use of the definite article here, and in similar contexts in the future, see the
disclaimer II.1.25, p. 220.

‡ Since ⊢ 𝒜ᵢ → 𝒜₁ ∨ ⋯ ∨ 𝒜̂ⱼ ∨ ⋯ ∨ 𝒜_k for j ≠ i.

§ +1, ×1 and ×2, associativity, commutativity.

¶ When it comes to formal counterparts of known number-theoretic functions we will abuse the
formal notation a bit, opting for argot in the interest of readability. Here we wrote “δ(x, y)” rather
than the nondescript “f²ᵢxy”, where f²ᵢ is the first unused (so far) function symbol of arity 2.
See also p. 166.
For the remaining case we do induction on x to prove

⊢ x ≥ y → (∃z)x = y + z          (21)


Pause. What happened to “∨ x < y”?
Basis. For x = 0, if y = 0 (cases!), then z = 0 works (via Ax2) by +1.
Otherwise (y ≠ 0), and we are done by <1.
For the induction step we want

Sx ≥ y → (∃z)Sx = y + z          (21′)

Let Sx ≥ y. This entails two cases:

If Sx = y, then z = 0 works.
Finally, say Sx > y. By <2, we have x ≥ y. By I.H.,

⊢ (∃z)x = y + z          (21″)

Add x = y + a to the hypotheses, where a is a new constant.† By +2, we have
⊢ Sx = y + Sa; hence ⊢ (∃z)Sx = y + z, and we are done proving (20).
We have at once


II.1.27 Lemma. ⊢ x ≤ y → δ(x, y) = 0, and ⊢ x ≥ y → x = y + δ(x, y).

Also,


II.1.28 Lemma. For any z, ⊢ zδ(x, y) = δ(zx, zy).

Proof. If x ≤ y, then zx ≤ zy, and hence ⊢ δ(x, y) = 0 and ⊢ δ(zx, zy) = 0.
If x ≥ y, then zx ≥ zy using II.1.20; hence (II.1.27)

⊢ x = y + δ(x, y)          (i)

and

⊢ zx = zy + δ(zx, zy)          (ii)

(i) yields

⊢ zx = zy + zδ(x, y)

using distribution of × over + (II.1.16), and therefore we are done by (ii)
and II.1.21.
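
The μ-definition (19) and the two lemmas above can be spot-checked informally on small numbers. The sketch below is a finite sanity check in the standard model, not a proof in PA; `mu` and `delta` are hypothetical names.

```python
from itertools import count

def mu(pred):
    """Least natural number satisfying pred; assumes a witness exists."""
    for x in count():
        if pred(x):
            return x

def delta(x, y):
    """Proper subtraction, following (19): the least z with x = y + z or x < y."""
    return mu(lambda z: x == y + z or x < y)

if __name__ == "__main__":
    small = range(8)
    # II.1.27: delta(x, y) = 0 when x <= y, and x = y + delta(x, y) when x >= y.
    print(all(delta(x, y) == 0 for x in small for y in small if x <= y))
    print(all(x == y + delta(x, y) for x in small for y in small if x >= y))
    # II.1.28: z * delta(x, y) = delta(z*x, z*y).
    print(all(z * delta(x, y) == delta(z * x, z * y)
              for x in small for y in small for z in range(5)))
```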


† Or, more colloquially, “let z = a work in (21″)”. We are using proof by auxiliary constant.

Now that we have mastered addition, subtraction, and multiplication, we
turn our attention to division. First we formalize remainders and quotients.
We start from the trivial observation

⊢ (∃z)(∃w)x = yw + z          (1)

(1) follows from x = y0 + x (+1 and ×1).
Thus we may introduce a new function (of arity 2), the remainder function
“r”, by the following μ-definition:†

r(x, y) = (μz)(∃w)x = yw + z          (rem)

We have at once ((f), p. 218)

⊢ (∃w)x = yw + r(x, y)          (EQ)

Therefore we can introduce quotients, q(x, y), via the definition of the 2-ary
function symbol “q” below:

q(x, y) = (μw)x = yw + r(x, y)          (Q)

To improve readability we will usually denote the term q(x, y) by ⌊x/y⌋
(note the boldface variables x and y, which distinguish the formal case from
the informal one), and we will hardly need to refer to the symbol “q” again.
One more application of (f), to (Q), yields

⊢ x = y⌊x/y⌋ + r(x, y)          (Euc)

The enabling condition (EQ) for (Q) yields

⊢ (∃w)x = 0·w + r(x, 0)

Hence the interesting

⊢ x = r(x, 0)

Since ⊢ x = 0·0 + r(x, 0), (Q) and (f⁽⁵⁾) (p. 218) yield ⊢ ⌊x/0⌋ ≤ 0. Hence,
by II.1.5,

⊢ ⌊x/0⌋ = 0


† We write “r(x, y)” rather than “rxy” in the interest of user-friendliness of notation.
Adding the usual assumption y > 0, we can guarantee the “standard”
properties of quotient and remainder, namely the inequality† r(x, y) < y and the
uniqueness of quotient and remainder.

To see this, add y ≤ r(x, y). Thus (II.1.27)

⊢ r(x, y) = y + t          (2)

where we have set t = δ(r(x, y), y) for convenience. By (Euc) and ×2 (invoking
associativity tacitly, as is usual)

⊢ x = yS⌊x/y⌋ + t

Hence,

⊢ (∃w)x = yw + t

from which, along with (rem) and (f) (p. 218),

⊢ r(x, y) ≤ t

Since also ⊢ r(x, y) ≥ t by (2) and II.1.17, we get ⊢ r(x, y) = t (II.1.8); hence
⊢ y = 0 from (2) and cancellation (II.1.21). By the deduction theorem we have
derived

⊢ y ≤ r(x, y) → y = 0

or, better still (via <3, irreflexivity, and II.1.5), the contrapositive

⊢ y > 0 → r(x, y) < y

We next turn to uniqueness. Let‡ then

z < y ∧ (∃w)x = yw + z          (3)

(3) derives y > 0 by II.1.5 and II.1.6.

By (f) and (rem), ⊢ r(x, y) ≤ z. Since we want “=” here, we add

r(x, y) < z          (4)

to obtain a contradiction. Reasoning by auxiliary constant, we also add

x = yq + z          (5)


† The “0 ≤ r(x, y)” part is trivial from ⊢ 0 ≤ x (II.1.5) and substitution.
‡ Cf. p. 209 for our frequent use of the argot “let”.

for some new constant q. Let now s = δ(z, r(x, y)). By (4) and II.1.27,

⊢ z = r(x, y) + s ∧ s > 0          (6)

Thus (5) and (Euc) yield (via II.1.21) ⊢ y⌊x/y⌋ = yq + s; hence (definition
of δ and II.1.28)

⊢ yδ(⌊x/y⌋, q) = s

We conclude that ⊢ y ≤ s (why?), which, along with ⊢ z < y, from (3),
and ⊢ s ≤ z, from (6) and II.1.17, yields (transitivity of <) the contradiction
⊢ y < y. Thus,

⊢ z < y ∧ (∃w)x = yw + z → z = r(x, y)

It is a trivial matter (not pursued here) now to obtain

⊢ z < y ∧ x = yw + z → w = ⌊x/y⌋ ∧ z = r(x, y)          (UN)

In (rem) (p. 224) the formula to the right of (μz) is provably equivalent to the
Δ₀-formula (∃w)_{≤x} x = yw + z (why?). Thus the latter could have been used
in lieu of the original.
Therefore the addition of r and its defining axiom to the theory results in a
recursive extension (such as we promised all our extensions by definitions to
be [p. 220]). □
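
The μ-definitions (rem) and (Q) can be mirrored informally in Python, using the bounded witness search w ≤ x that corresponds to the Δ₀ form mentioned above. This is a sketch over the standard natural numbers with hypothetical names `mu`, `r` and `q`; it checks instances and proves nothing in PA.

```python
from itertools import count

def mu(pred):
    """Least natural number satisfying pred; assumes a witness exists."""
    for x in count():
        if pred(x):
            return x

def r(x, y):
    """Remainder per (rem): least z such that x = y*w + z for some w <= x."""
    return mu(lambda z: any(x == y * w + z for w in range(x + 1)))

def q(x, y):
    """Quotient per (Q): least w with x = y*w + r(x, y)."""
    return mu(lambda w: x == y * w + r(x, y))

if __name__ == "__main__":
    print(r(7, 3), q(7, 3))
    # Division by zero is total here, as in the text: r(x, 0) = x and q(x, 0) = 0.
    print(r(5, 0), q(5, 0))
    # (Euc) on a small range, and r(x, y) < y whenever y > 0.
    print(all(x == y * q(x, y) + r(x, y) for x in range(10) for y in range(10)))
    print(all(r(x, y) < y for x in range(10) for y in range(1, 10)))
```

Bounding w by x loses nothing: if y > 0 and x = yw + z then w ≤ x, and if y = 0 the witness w = 0 already works.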




We next introduce the divisibility predicate “|” by the axiom (D) below:

x | y ↔ r(y, x) = 0          (D)

Once again, the use of boldface variables will signify that we are in the formal
domain.


II.1.29 Lemma. ⊢ y | x ↔ (∃z)yz = x.


Proof. By tautological implication, we have two directions to deal with:
→: It follows by (Euc) above, +1, and Ax2.

←: By the deduction theorem; assume (∃z)yz = x towards proving y | x.
Thus ⊢ (∃z)yz + 0 = x by +1. By (rem) and (f), we have ⊢ r(x, y) ≤ 0.
Done, by antisymmetry and II.1.5.


We now define relative primality: We say that x and y are relatively prime iff

⊢ (∀z)(x | yz → x | z)          (7)

We introduce the metanotation RP(x, y) to stand for (∀z)(x | yz → x | z). Thus,
we may write (7) as ⊢ RP(x, y).

We emphasize that “RP” is not introduced as a predicate into the language;†
rather it is introduced as a metamathematical abbreviation, that is, we write
“RP(x, y)” simply to avoid writing the formula in (7).

The technical reason is that we only introduce predicates if we are prepared
to give a Δ₀-formula to the right of “↔” in their definition (since we want to
only consider recursive extensions‡ of PA). □


II.1.30 Remark. Strictly speaking, RP as (informally) defined gives relative
primality correctly, as we understand it intuitively, only when both numbers are
nonzero. It has some pathologies: e.g., the counterintuitive

⊢ RP(0, 2)

Indeed, to prove (∀z)(0 | 2z → 0 | z), strip it of the quantifier and assume

r(2z, 0) = 0

Hence (p. 224)

⊢ 2z = 0

Thus (⊢ 2 > 0, ⊢ 0·2 = 0, and II.1.21) ⊢ z = 0; hence

⊢ 0 | z

Similarly, one can prove ⊢ x > 0 → RP(0, x).


Of course, in “reality” (i.e., informally), 0 and 2 are not relatively prime,
since their greatest common divisor is 2. □
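
The pathology can be observed informally with a bounded check. The sketch below tests the implication x | yz → x | z only for z below a chosen bound, so a result of True is evidence on that range and not a proof of RP. The names `r`, `divides` and `rp_up_to` are hypothetical, and `r` is written directly with Python's `%` (with r(x, 0) = x, as derived in the text) as a shortcut for the μ-definition.

```python
def r(x, y):
    """Remainder with the text's convention r(x, 0) = x."""
    return x if y == 0 else x % y

def divides(x, y):
    """x | y, following (D): r(y, x) = 0."""
    return r(y, x) == 0

def rp_up_to(x, y, bound):
    """Check x | y*z -> x | z for every z < bound (a finite test only)."""
    return all((not divides(x, y * z)) or divides(x, z) for z in range(bound))

if __name__ == "__main__":
    print(rp_up_to(0, 2, 50))   # the counterintuitive case from the remark
    print(rp_up_to(3, 5, 50))   # an ordinary coprime pair
    print(rp_up_to(4, 6, 50))   # fails at z = 2: 4 | 12 but 4 does not divide 2
    print(rp_up_to(0, 0, 50))   # fails at z = 1: 0 | 0 but 0 does not divide 1
```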


II.1.31 Exercise. Show that ⊢ RP(x, y) → x > 0 ∨ y > 0.


II.1.32 Lemma. $\vdash (\exists z)(\exists w)\,\delta(xz, yw) = 1 \rightarrow RP(x, y)$.
Proof. Assume the hypothesis


    $(\exists z)(\exists w)\,\delta(xz, yw) = 1$


† That would have been through a definition like $RP(x, y) \leftrightarrow (\forall z)(x \mid yz \rightarrow x \mid z)$.
‡ Cf. p. 220.

Let $\delta(xa, yb) = 1$, where a and b are new constants. Add also

    $x \mid yz$    (8)

towards proving $x \mid z$.

Now $\vdash z\,\delta(xa, yb) = z$;† hence

    $\vdash \delta(zxa, zyb) = z$    (9)

by II.1.28. Setting $c = \lfloor zy/x \rfloor$ for convenience,‡ we get $\vdash xc = zy$ from (8).
Thus, by (9), $\vdash \delta(zxa, xcb) = z$; hence (again via II.1.28)

    $\vdash x\,\delta(za, cb) = z$

That is, $\vdash x \mid z$, and an application of the deduction theorem followed by
generalization yields $(\forall z)(x \mid yz \rightarrow x \mid z)$, i.e., $RP(x, y)$.


We have already remarked that we will be using associativity and commutativity
of “+” and “×” without notice.


II.1.33 Lemma. $\vdash x > 0 \rightarrow RP(x, y) \rightarrow RP(y, x)$.


Proof. The case y = 0 is trivial by Remark II.1.30. Thus, take now the case
$y > 0$.

We add the assumptions $x > 0$, $RP(x, y)$, and

    $y \mid xz$    (10)

towards proving $y \mid z$. By (10),

    $\vdash xz = ya$    (11)

where $a = \lfloor xz/y \rfloor$. Thus $x \mid ya$; hence $x \mid a$ by $RP(x, y)$.

Write then $a = xq$ ($q = \lfloor a/x \rfloor$), from which and (11) $\vdash xz = yxq$. Thus,
$\vdash z = yq$, by II.1.21 and our first hypothesis, which proves $y \mid z$.


† Assuming you believe that $x = x1$.
‡ Another aspect of convenience is to invoke commutativity or associativity of either + or × tacitly.

Thus, what we have explicitly derived under the second case above (via the
deduction theorem) was

    $y > 0 \vdash x > 0 \rightarrow RP(x, y) \rightarrow y \mid xz \rightarrow y \mid z$

Hence we also derived (by $\forall$-introduction) what we really wanted,

    $y > 0 \vdash x > 0 \rightarrow RP(x, y) \rightarrow (\forall z)(y \mid xz \rightarrow y \mid z)$

The moral is that even “formal”, but “practical”, proofs often omit the obvious.
While we could just as easily have incorporated these couple of lines in the
proof above, we are going to practise this shortening of proofs again and again.
Hence the need for this comment.


II.1.34 Lemma. $\vdash k \mid p \rightarrow RP(S(ip), S((i + k)p))$.
Proof. The case p = 0 being trivial ($\vdash RP(S0, S0)$), we argue the case $p > 0$.
Add $k \mid p$ towards proving

    $RP(S(ip), S((i + k)p))$    (i)

Thus,

    $p = ka$    (ii)

where $a = \lfloor p/k \rfloor$. By II.1.32

    $\vdash RP(S(ip), p)$    (iii)

and

    $\vdash RP(S(ip), k)$    (iv)

each because $\delta(S(ip), ip) = 1$. Add $S(ip) \mid zS((i + k)p)$ towards proving
$\vdash S(ip) \mid z$.

By hypothesis, $\vdash S(ip) \mid zkp$ (fill in the missing steps); hence $\vdash S(ip) \mid zp$
by (iv). Then $\vdash S(ip) \mid z$ by (iii).
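
Informally, for nonzero arguments RP coincides with “greatest common divisor equal to 1”, so the lemma says that 1 + ip and 1 + (i + k)p are coprime whenever k divides p. A minimal numeric check of that reading follows; the function names are hypothetical and the test covers only a small range.

```python
from math import gcd


def S(n: int) -> int:
    """Successor."""
    return n + 1


def check(limit: int) -> bool:
    """Test gcd(S(i*p), S((i+k)*p)) = 1 for all k | p with values below limit."""
    for p in range(limit):
        for k in range(1, limit):
            if p % k != 0:          # keep only the pairs with k | p
                continue
            for i in range(limit):
                if gcd(S(i * p), S((i + k) * p)) != 1:
                    return False
    return True


all_coprime = check(30)

# The hypothesis k | p matters: with i = 1, k = 2, p = 3 we get 4 and 10.
without_hypothesis = gcd(S(1 * 3), S((1 + 2) * 3))
print(all_coprime, without_hypothesis)
```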


We now embark on introducing coding of sequences formally.†

The formal counterpart of a sequence $a_0, a_1, \ldots, a_n$ of (variable) length‡ $n + 1$
is a term $t(n, \vec{x})$, where the parenthesis notation lists all free variables of t. We
may also simply write $t[n]$ (see p. 18).

† This will be a “careful” repetition of the definition of Gödel’s β-function ((2) of p. 159).
‡ “n” is here a variable, not a numeral. That is why we wrote “$\mathbf{n}$” rather than “$\widetilde{n}$”.

We introduce the maximum of the first n + 1 members of a sequence

    $\max_{i \le n}(t(i, \vec{x})) = (\mu z)(\forall i)_{\le n}\, z \ge t(i, \vec{x})$    (M)

To legitimize (M) we need to establish

    $\vdash (\exists z)(\forall i)(i \le n \rightarrow z \ge t(i, \vec{x}))$    (M′)

We prove (M′) by induction on n.
For n = 0 (M′) follows immediately (do $[z \leftarrow t(0, \vec{x})]$ and apply Ax2).

Add now (M′) for frozen n and $\vec{x}$, and show that

    $\vdash (\exists z)(\forall i)(i \le Sn \rightarrow z \ge t(i, \vec{x}))$    (M″)

Let a satisfy (M′) (where in the latter n and $\vec{x}$ are still frozen).
That is, formally, add a new constant symbol a, and the new axiom

    $(\forall i)(i \le n \rightarrow a \ge t(i, \vec{x}))$

It follows (specialization) that

    $\vdash i \le n \rightarrow a \ge t(i, \vec{x})$    (1)

Since $\vdash a + t(Sn, \vec{x}) \ge a$ and $\vdash a + t(Sn, \vec{x}) \ge t(Sn, \vec{x})$, proof by cases,† (1),
and $\vdash i \le Sn \leftrightarrow i \le n \lor i = Sn$ yield

    $\vdash i \le Sn \rightarrow a + t(Sn, \vec{x}) \ge t(i, \vec{x})$    (2)

Hence (generalization)

    $\vdash (\forall i)(i \le Sn \rightarrow a + t(Sn, \vec{x}) \ge t(i, \vec{x}))$

and via Ax2

    $\vdash (\exists z)(\forall i)(i \le Sn \rightarrow z \ge t(i, \vec{x}))$

This is (M″). Now the deduction theorem confirms the induction step.


We next introduce the least common multiple of the first n + 1 members of
a “positive sequence”:

    $\mathrm{lcm}_{i \le n}(St(i, \vec{x})) = (\mu z)(z > 0 \land (\forall i)_{\le n}\, St(i, \vec{x}) \mid z)$    (LCM)

To legitimize (LCM) we need to establish

    $\vdash (\exists z)(z > 0 \land (\forall i)(i \le n \rightarrow St(i, \vec{x}) \mid z))$    (LCM′)

† With help from the logical $i = Sn \rightarrow t(i, \vec{x}) = t(Sn, \vec{x})$.

One can easily prove (LCM′) by induction on n. In outline, a z that works for
n = 0 is $St(0, \vec{x})$. If now a (auxiliary constant!) works for n (the latter frozen
along with $\vec{x}$), then $a\,St(Sn, \vec{x})$ works for Sn. The details are left to the reader.


The positive sequence above has members $St(i, \vec{x})$, “indexed” by i, where t is
some term.
This representation stems from the fact that $t > 0 \rightarrow t = S\delta(t, 1)$.
Alternatively, we could have opted to μ-define lcm with a condition: that
$(\forall i \le n)\, t(i, \vec{x}) > 0$. This approach complicates the defining formula as we try
to make the μ-defined object total (when interpreted in the standard structure).


Axiom (LCM) implies at once (f and f© of p. 218) 


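    $\vdash (z > 0 \land (\forall i)_{\le n}\, St(i, \vec{x}) \mid z) \rightarrow \mathrm{lcm}_{i \le n}(St(i, \vec{x})) \le z$    (1)
    $\vdash \mathrm{lcm}_{i \le n}(St(i, \vec{x})) > 0$    (2)

and†

    $\vdash (\forall i)_{\le n}\big(St(i, \vec{x}) \mid \mathrm{lcm}_{i \le n}(St(i, \vec{x}))\big)$    (3)

(1) can be sharpened to

    $\vdash (z > 0 \land (\forall i)_{\le n}\, St(i, \vec{x}) \mid z) \rightarrow \mathrm{lcm}_{i \le n}(St(i, \vec{x})) \mid z$    (1′)

Indeed, assume the left hand side of “$\rightarrow$” in (1′) and also let

    $r > 0 \land z = \mathrm{lcm}_{i \le n}(St(i, \vec{x}))\,q + r$    (4)

where the terms r and q are the unique (by (2)) remainder and quotient of the
division $z / \mathrm{lcm}_{i \le n}(St(i, \vec{x}))$ respectively. That is, we adopt the negation of the
right hand side of “$\rightarrow$” in (1′) and hope for a contradiction.
Well, by (3), (4) and our additional assumptions immediately above,

    $\vdash r > 0 \land (\forall i)_{\le n}\, St(i, \vec{x}) \mid r$

hence

    $\vdash \mathrm{lcm}_{i \le n}(St(i, \vec{x})) \le r$

by (1), contradicting the remainder inequality (by (2) the divisor in (4) is
positive). This establishes that $r > 0$ is untenable and proves (1′).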
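
The step from (1) to (1′) says that the least positive common multiple divides every positive common multiple. A small numeric sketch of this, reading (LCM) literally as an unbounded search, is given below; the sample sequence and all names are assumptions chosen for illustration.

```python
from functools import reduce
from math import gcd


def lcm_by_search(values):
    """Literal reading of (LCM): the least z > 0 that every member divides."""
    z = 1
    while any(z % v != 0 for v in values):
        z += 1
    return z


def lcm_by_gcd(values):
    """The usual closed form, used here only as a cross-check."""
    return reduce(lambda a, b: a * b // gcd(a, b), values, 1)


# A "positive sequence" S t(i) for the sample values t = 3, 5, 1, 7.
members = [t + 1 for t in (3, 5, 1, 7)]
m = lcm_by_search(members)

# (1'): every positive common multiple below 500 is divisible by m.
common = [z for z in range(1, 500) if all(z % v == 0 for v in members)]
sharpened = all(z % m == 0 for z in common)
print(m, sharpened)
```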


We now revisit Lemma I.9.6 (actually proving a bit more).

† The big brackets are superfluous but they improve readability.

II.1.35 Lemma. Let t and s be terms. Then

    $\vdash s > S0 \land (\forall i)_{\le n} RP(s, St[i]) \rightarrow RP(s, \mathrm{lcm}_{i \le n}(St[i]))$

Proof. Let

    $s > S0 \land (\forall i)_{\le n} RP(s, St[i])$    (5)

Let also (we are implicitly using II.1.33) $\mathrm{lcm}_{i \le n}(St[i]) \mid sz$. By (3) and (5)

    $\vdash (\forall i)_{\le n}\, St[i] \mid z$

Taking cases, if z = 0 then

    $\vdash \mathrm{lcm}_{i \le n}(St[i]) \mid z$    (6)

anyway. If $z > 0$ then (6) is again obtained, this time by (1′).

The following is the formalized Lemma I.9.6.

II.1.36 Corollary. Let t and s be terms. Then

    $\vdash s > S0 \land (\forall i)_{\le n} RP(s, St[i]) \rightarrow \neg\, s \mid \mathrm{lcm}_{i \le n}(St[i])$

Proof. Exercise II.11.

The following steps formalize those taken starting with p. 156. Note that c,
p, and q below are just convenient (metamathematical) abbreviations of the
respective right hand sides.
Let $t(n, \vec{x})$ be a term, and set

    $c = \max_{i \le n}(St[i])$    (C)

We next let p be the lcm of the sequence S0, ..., Sc (informally, 1, ..., c + 1).
Thus we set†

    $p = \mathrm{lcm}_{i \le c}(Si)$    (P)

Finally, define the term q by the explicit definition‡

    $q = \mathrm{lcm}_{i \le n}(S(pSt[i]))$    (Q)

† That is, p stands for s[c], where s[n] abbreviates $\mathrm{lcm}_{i \le n}(Si)$, n being a variable.
‡ Of course, a more user-friendly way to write “$S(pSt[i])$” is “$1 + p(1 + t[i])$”.

By (P) and (2) and (3) (p. 231) above,

    $\vdash p > 0 \land (\forall i)_{\le c}\, Si \mid p$    (P′)

We can now derive

    $\vdash y \le c \rightarrow (\forall i)(i \le n \rightarrow \neg\, y = t[i]) \rightarrow (\forall i)(i \le n \rightarrow RP(S(pSy), S(pSt[i])))$    (6)

To see why (6) holds, add the assumptions $y \le c$ and $i \le n \rightarrow \neg\, y = t[i]$. We
now try to prove

    $i \le n \rightarrow RP(S(pSy), S(pSt[i]))$

So add $i \le n$. This yields $\neg\, y = t[i]$, which splits into two cases by <3,
namely, $y > t[i]$ and $y < t[i]$.

We will consider first the case $y > t[i]$. Set $k = \delta(y, t[i])$ for convenience.

Now $\vdash 0 < k \land k \le y$; hence also $\vdash k \le c$ by assumption and transitivity.
Thus $\vdash k \mid p$ by (P′) and Exercise II.10.

Moreover (by II.1.27) $\vdash y = t[i] + k$, hence (via +2) $\vdash Sy = St[i] + k$.
Thus II.1.34 yields

    $\vdash RP(S(pSy), S(pSt[i]))$

The other case, $y < t[i]$, is handled entirely similarly with slightly different
start-up details: This time we set $k = \delta(t[i], y)$.

Now $\vdash 0 < k \land k \le t[i]$; hence also $\vdash k \le c$.

Why? Well, $\vdash i \le n \rightarrow St[i] \le c$ by (C) and (M) (p. 230) via (f)
(p. 218). Now the assumption $i \le n$ yields $\vdash St[i] \le c$, and transitivity does
the rest.

Thus $\vdash k \mid p$ by (P′), and one continues the proof as in the previous case:
By II.1.27 $\vdash t[i] = y + k$; hence (via +2) $\vdash St[i] = Sy + k$. Thus II.1.34
yields

    $\vdash RP(S(pSy), S(pSt[i]))$

once more.
At this point we have derived (by the deduction theorem)

    $y \le c \vdash (i \le n \rightarrow \neg\, y = t[i]) \rightarrow (i \le n \rightarrow RP(S(pSy), S(pSt[i])))$    (7)

Hence, by $\forall$-monotonicity (I.4.24),

    $y \le c \vdash (\forall i)(i \le n \rightarrow \neg\, y = t[i]) \rightarrow (\forall i)(i \le n \rightarrow RP(S(pSy), S(pSt[i])))$    (7′)

(6) now follows by the deduction theorem.


We immediately derive from (6), (Q), and II.1.36 that

    $\vdash y \le c \rightarrow (\forall i \le n)(\neg\, y = t[i]) \rightarrow \neg\, S(pSy) \mid q$    (8)

Hence, by tautological implication,

    $\vdash y \le c \rightarrow S(pSy) \mid q \rightarrow (\exists i \le n)\, y = t[i]$    (8′)

Thus, informally speaking, q “codes” the unordered set of all “objects”

    $T = \{S(pSt[i]) : i \le n\}$

in the sense that if x is in T, then $x \mid q$, and, conversely, if $S(pSy) \mid q$, where
$y \le c$, then $S(pSy)$ is in T. By coding “position information”, i, along with
the term t[i], we can retrieve from q the ith sequence member t[i].

To this end, we define three new function symbols, J, K, L, of arities 2, 1,
and 1 respectively:

    $J(x, y) = (x + y)^2 + x$    (J)

where “$(x + y)^2$” is an abbreviation for “$(x + y) \times (x + y)$”,

    $Kz = (\mu x)(x = Sz \lor (\exists y)_{\le z}\, J(x, y) = z)$    (K)
    $Lz = (\mu y)(y = Sz \lor (\exists x)_{\le z}\, J(x, y) = z)$    (L)

J, K, L are the formal counterparts of J, K, L of p. 156.
To legitimize (K) and (L) one needs to show

    $\vdash (\exists x)(x = Sz \lor (\exists y)_{\le z}\, J(x, y) = z)$    (K′)

and

    $\vdash (\exists y)(y = Sz \lor (\exists x)_{\le z}\, J(x, y) = z)$    (L′)

They are both trivial, since $\vdash Sz = Sz$.


II.2.1 Lemma. $\vdash J(x, y) = J(a, b) \rightarrow x = a \land y = b$.


Proof. A straightforward adaptation of the argument following (∗) on p. 156.


II.2.2 Lemma. $\vdash KJ(a, b) = a$ and $\vdash LJ(a, b) = b$.


Proof. We just prove the first contention, the proof of the second being entirely
analogous.
First, it is a trivial matter to prove $\vdash x \le J(x, y)$ and $\vdash y \le J(x, y)$
(Exercise II.12). Now $\vdash b \le J(a, b) \land J(a, b) = J(a, b)$; hence

    $\vdash a = SJ(a, b) \lor (\exists y)(y \le J(a, b) \land J(a, y) = J(a, b))$    (1)

By (K), (1) above, and (f) (p. 218) we have

    $\vdash KJ(a, b) \le a$    (2)

while (K) and (f) (p. 218) yield

    $\vdash KJ(a, b) = SJ(a, b) \lor (\exists y)(y \le J(a, b) \land J(KJ(a, b), y) = J(a, b))$    (3)

Since $KJ(a, b) = SJ(a, b)$ is untenable by (2), we get

    $\vdash (\exists y)(y \le J(a, b) \land J(KJ(a, b), y) = J(a, b))$    (4)

Let $c \le J(a, b) \land J(KJ(a, b), c) = J(a, b)$, where c is a new constant. By
II.2.1, $\vdash KJ(a, b) = a$.
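
Interpreted over the natural numbers, (J), (K) and (L) can be run directly. The sketch below implements K and L as the bounded searches of their defining axioms and, as a cross-check, a closed-form inverse based on the integer square root; `K_search`, `L_search` and `KL_fast` are hypothetical names, and the closed form is an observation added here, not something taken from the text.

```python
from math import isqrt


def J(x: int, y: int) -> int:
    """Pairing function (J): (x + y)^2 + x."""
    return (x + y) ** 2 + x


def K_search(z: int) -> int:
    """(K): least x with x = z + 1 or (exists y <= z) J(x, y) = z."""
    for x in range(z + 2):
        if x == z + 1 or any(J(x, y) == z for y in range(z + 1)):
            return x


def L_search(z: int) -> int:
    """(L): least y with y = z + 1 or (exists x <= z) J(x, y) = z."""
    for y in range(z + 2):
        if y == z + 1 or any(J(x, y) == z for x in range(z + 1)):
            return y


def KL_fast(z: int):
    """If z = J(x, y) then isqrt(z) = x + y; otherwise both searches give z + 1."""
    s = isqrt(z)
    x = z - s * s
    return (x, s - x) if x <= s else (z + 1, z + 1)


# Lemma II.2.2 on a small grid, and agreement of the two implementations.
projections_ok = all(K_search(J(a, b)) == a and L_search(J(a, b)) == b
                     for a in range(6) for b in range(6))
fast_ok = all(KL_fast(z) == (K_search(z), L_search(z)) for z in range(60))
print(projections_ok, fast_ok, KL_fast(3), KL_fast(2))
```

The values for z = 3 and z = 2 are the ones used in Example II.2.7: 3 is not in the range of J, while 2 = J(1, 0).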


To conclude our coding, whose description we launched with the @-sign on 
p. 232, let finally a[n] be a term. 

We code the sequence $a(i, \vec{x})$, for $i \le n$, by following the above steps, letting
first t of the previous discussion be an abbreviation of the specific term below:

    $t(i, \vec{x}) = J(i, a(i, \vec{x}))$, where i and $\vec{x}$ are distinct variables    (T)

Thus, by (8′) (p. 234) and substitution, we have

    $\vdash J(i, m) \le c \land S(pSJ(i, m)) \mid q \rightarrow (\exists j \le n)\, J(i, m) = J(j, a[j])$

which (by II.2.1†) yields

    $\vdash J(i, m) \le c \land S(pSJ(i, m)) \mid q \rightarrow m = a[i] \land i \le n$    (5)

This motivates the definition (where d is intended to receive the “value”‡
$J(c, q)$)

II.2.3 Definition (The Formal β).

    $\beta(d, i) = (\mu m)(m = d \lor S(pSJ(i, m)) \mid Ld)$    (B)

The letter p in (B) is an abbreviation for the term $\mathrm{lcm}_{j \le Kd}(Sj)$ (see (P),
p. 232).

† $\vdash (\exists j)(j \le n \land J(i, m) = J(j, a[j])) \rightarrow m = a[i] \land i \le n$. To see this, assume hypothesis
and use a new constant b to eliminate $(\exists j)$.
‡ Of course, regardless of intentions, the letter d in the definition (B) is just a variable, like i, m.


That (B) is a legitimate definition, that is,

    $\vdash (\exists m)(m = d \lor S(pSJ(i, m)) \mid Ld)$    (B′)

follows from $\vdash x = x$.


II.2.4 Proposition.

(i) $\vdash \beta(x, i) \le x$. Moreover,
(ii) $\vdash \beta(x, i) < x \leftrightarrow (\exists m)(S(pSJ(i, m)) \mid Lx)$, where $p = \mathrm{lcm}_{j \le Kx}(Sj)$.


Proof. (i) is immediate from (B), $\vdash x = x$ and (f) (p. 218).

(ii): The $\rightarrow$-part is immediate from (B) and (f) (p. 218).

As for $\leftarrow$, assume $(\exists m)(S(pSJ(i, m)) \mid Lx)$.

Let $S(pSJ(i, r)) \mid Lx$, where r is a new constant. Hence $\vdash S(pSJ(i, r)) \le Lx$.
We also have $\vdash Lx \le Sx$ by (L) (p. 234) and (f) (p. 218); thus
$\vdash pSJ(i, r) \le x$ by transitivity and II.1.9 (contrapositive). But $\vdash r < pSJ(i, r)$
by $\vdash y \le J(x, y)$ (Exercise II.12); hence $\vdash r < x$. Since $\vdash \beta(x, i) \le r$ by (B)
and (f) (p. 218), we are done.


All this work yields the “obvious”:


II.2.5 Theorem. For any term $a(i, \vec{x})$,

    $\vdash (\forall x_1) \cdots (\forall x_m)(\forall n)(\exists z)(\forall i)(i \le n \rightarrow \beta(z, i) = a(i, \vec{x}))$    (6)

where m is the length of $\vec{x}$.


Proof. We prove instead

    $\vdash (\exists z)(\forall i)(i \le n \rightarrow \beta(z, i) = a(i, \vec{x}))$

The proof constructs a “z that works” (and then invokes Ax2). To this end,
we let t be that in (T), and in turn, let c, p, q stand for the terms in the right
hand sides of (C), (P), and (Q) respectively (p. 232). Setting for convenience
$d = J(c, q)$, we are reduced to proving

    $i \le n \rightarrow \beta(d, i) = a(i, \vec{x})$    (7)

Thus we add the assumption $i \le n$. We know

    $\vdash J(i, a(i, \vec{x})) < c$, by (C), (M) on p. 230, and (f) (p. 218)
    $\vdash S(pSJ(i, a(i, \vec{x}))) \mid q$, by (Q) and (3) on p. 231

Or, using the abbreviation “d” and II.2.2

    $\vdash J(i, a(i, \vec{x})) < Kd$    (8)
    $\vdash S(pSJ(i, a(i, \vec{x}))) \mid Ld$    (9)

Thus,

    $\vdash (\exists m)(S(pSJ(i, m)) \mid Ld)$

The above existential theorem and II.2.4(ii) imply

    $\vdash \beta(d, i) < d$    (10)

so that (B) (p. 235) and (f), through $\vdash \neg\, \beta(d, i) = d$ by (10), yield

    $\vdash S(pSJ(i, \beta(d, i))) \mid Ld$    (11)

By (9), (B), and (f) (p. 218), $\vdash \beta(d, i) \le a(i, \vec{x})$; hence, since J is increasing
in each argument (do you believe this?), (8) implies

    $\vdash J(i, \beta(d, i)) < Kd$

Combining the immediately above with (11), we obtain

    $\vdash J(i, \beta(d, i)) < Kd \land S(pSJ(i, \beta(d, i))) \mid Ld$

Now (5) on p. 235 yields

    $\vdash \beta(d, i) = a(i, \vec{x})$

By the deduction theorem, we now have (7).
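
The construction in this proof can be carried out on actual numbers. The sketch below builds the “natural” code d = J(c, q) of a short list following (T), (C), (P), (Q), and decodes it with a direct reading of (B). All function names are hypothetical, the closed-form inverse `KL` of J is an assumption that replaces the bounded searches (K) and (L) for speed, and the list [3, 1, 2] is just a sample.

```python
from math import gcd, isqrt


def lcm(a: int, b: int) -> int:
    return a * b // gcd(a, b)


def J(x: int, y: int) -> int:
    return (x + y) ** 2 + x


def KL(z: int):
    """(K z, L z): the pair coded by z, or (z + 1, z + 1) if z is not in the range of J."""
    s = isqrt(z)
    x = z - s * s
    return (x, s - x) if x <= s else (z + 1, z + 1)


def lcm_upto_succ(c: int) -> int:
    """lcm of S0, ..., Sc, that is, of 1, ..., c + 1."""
    out = 1
    for j in range(1, c + 2):
        out = lcm(out, j)
    return out


def beta(d: int, i: int) -> int:
    """(B): least m with m = d or S(p * S J(i, m)) | L d, where p = lcm_{j <= K d}(S j)."""
    k, l = KL(d)
    p = lcm_upto_succ(k)
    m = 0
    while m != d and l % (1 + p * (1 + J(i, m))) != 0:
        m += 1
    return m


def natural_code(a):
    """d = J(c, q) for the finite list a, following (T), (C), (P), (Q)."""
    t = [J(i, ai) for i, ai in enumerate(a)]      # (T)
    c = max(ti + 1 for ti in t)                   # (C)
    p = lcm_upto_succ(c)                          # (P)
    q = 1
    for ti in t:                                  # (Q)
        q = lcm(q, 1 + p * (1 + ti))
    return J(c, q)


a = [3, 1, 2]
d = natural_code(a)
decoded = [beta(d, i) for i in range(len(a))]
print(decoded, d.bit_length())
```

The code d is astronomically larger than the numbers it stores, and `beta(d, i)` for an index i outside the coded range would search up to d, so this is a demonstration of the definitions rather than a practical encoding.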
II.2.6 Corollary. For any term $a(i, \vec{x})$,

    $\vdash (\forall x_1) \cdots (\forall x_m)(\forall n)(\exists z)(\forall i)_{\le n}(\beta(z, i) < z \land \beta(z, i) = a(i, \vec{x}))$

where m is the length of $\vec{x}$.


Proof. By (10) of the previous proof.

II.2.7 Example (Some Pathologies). (1) By II.2.4(i) we get $\vdash \beta(0, i) = 0$
(using II.1.5 and II.1.8). Thus, if we introduce a 1-ary function letter f by the
explicit definition fn = 0, then $\vdash i \le n \rightarrow \beta(0, i) = fi$. It follows,
according to (6) of II.2.5, that 0 “codes” the sequence of the first n members of the
term fi, for any n “value”.

(2) Next, “compute” $\beta(3, i)$. Now, $\vdash K3 = 4$ and $\vdash L3 = 4$ (why?). Since
$\vdash p = 60$ (why?), we get

    $\vdash \neg\, S(pSJ(i, m)) \mid L3$

since $\vdash S(pSJ(i, m)) \ge 61$ (why?). By II.2.4(ii), $\vdash \beta(3, i) = 3$.†
Thus, if we have a function symbol g with a definition gn = 3, then a possible
proof of

    $\vdash (\exists z)(\forall i)(i \le w \rightarrow \beta(z, i) = gi)$

starts with “take z to be 3”. An alternative “value” for z is the “d” constructed
in the proof of II.2.5, adapted to the term gn. We may call the latter z-“value”
the “intended one” or the “natural one”.


Clearly, “intended” or not, any z that works in (6) of II.2.5 is an in principle
acceptable coding of the first “n members” of a term a.


(3) Finally, let us compute $\beta(2, i)$. Now, $\vdash K2 = 1$ and $\vdash L2 = 0$. Also
$\vdash p = 2$. It follows that

    $\vdash \beta(2, i) = 0$

since

    $\vdash S(pSJ(i, 0)) \mid 0$

Thus, if f is introduced as in part (1) of this example by fn = 0, then

    $\vdash (\exists z)(\forall i)(i \le w \rightarrow \beta(z, i) = fi)$

can be proved by letting z be 2, or 0, or by invoking the construction carried
out in the proof of II.2.5.‡ In particular, β is not 1-1 in its first argument.
Part (3) of this example shows that an x that passes the test “$\beta(x, i) < x$”
is not necessarily the d computed in the standard manner as in the proof of
II.2.5, i.e., we cannot expect $\vdash x = d$. After all, $\vdash \beta(2, i) < 2$.
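
The three computations of this example can be replayed on the standard numbers. The sketch below is self-contained and uses a direct reading of (B); the helper names and the closed-form inverse of J are assumptions made for the illustration.

```python
from math import gcd, isqrt


def J(x: int, y: int) -> int:
    return (x + y) ** 2 + x


def KL(z: int):
    """(K z, L z), with the value z + 1 for both when z is not in the range of J."""
    s = isqrt(z)
    x = z - s * s
    return (x, s - x) if x <= s else (z + 1, z + 1)


def p_of(d: int) -> int:
    """p = lcm of S j for j <= K d."""
    out = 1
    for j in range(1, KL(d)[0] + 2):
        out = out * j // gcd(out, j)
    return out


def beta(d: int, i: int) -> int:
    """(B): least m with m = d or S(p * S J(i, m)) | L d."""
    l, p, m = KL(d)[1], p_of(d), 0
    while m != d and l % (1 + p * (1 + J(i, m))) != 0:
        m += 1
    return m


indices = range(5)
part1 = [beta(0, i) for i in indices]   # 0 codes the all-zero sequence
part2 = [beta(3, i) for i in indices]   # K3 = L3 = 4, p = 60: the search fails
part3 = [beta(2, i) for i in indices]   # L2 = 0 is divisible by everything
print(part1, part2, part3, p_of(3), p_of(2))
```

Since `part1` and `part3` coincide, the codes 0 and 2 are indistinguishable through β, which is the failure of injectivity noted above.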


† This is related to the example $\beta(4, i) = 4$ of II.2.14.
‡ If the latter construction is followed, then $\vdash Lz > 0$, of course.

We are all set to introduce the formal counterparts of $\langle \ldots \rangle$, Seq, lh, ∗, $(z)_i$
of p. 165.


II.2.8 Definition (Bold $\langle \ldots \rangle$). For any term t and any variable w not free in
t, we denote by $\langle t[i] : i < w \rangle$ (or, sloppily, $\langle t[0], \ldots, t[w - 1] \rangle$) the μ-term

    $(\mu z)(\beta(z, 0) = w \land (\forall i)_{<w}(\beta(z, Si) = St[i]))$    (FC)


II.2.9 Proposition. The μ-term in (FC) can be formally introduced.
Proof. We want

    $\vdash (\exists z)(\beta(z, 0) = w \land (\forall i)_{<w}(\beta(z, Si) = St[i]))$    (FC′)


Let a[w, n] abbreviate the term defined by cases below (see p. 221):

    $a[w, n] = \begin{cases} w & \text{if } n = 0 \\ St[\delta(n, S0)] & \text{if } n > 0 \end{cases}$    (∗)

(∗) yields ((15) on p. 221)

    $\vdash n = 0 \rightarrow a[w, n] = w$

and hence (Ax4)

    $\vdash a[w, 0] = w$    (1)

but also

    $\vdash n > 0 \rightarrow a[w, n] = St[\delta(n, S0)]$

Hence (by $\vdash Sn > 0$ and modus ponens)

    $\vdash a[w, Sn] = St[\delta(Sn, S0)]$

or, using $\vdash \delta(Sn, S0) = n$ (do you believe this?)

    $\vdash a[w, Sn] = St[n]$    (2)

Now, by Theorem II.2.5,

    $\vdash (\exists z)(\forall i)(i \le w \rightarrow \beta(z, i) = a[w, i])$

In view of the above existential statement, we introduce a new constant c and
the assumption

    $(\forall i)(i \le w \rightarrow \beta(c, i) = a[w, i])$

By specialization we obtain from the above

    $\vdash 0 \le w \rightarrow \beta(c, 0) = a[w, 0]$

that is,

    $\vdash \beta(c, 0) = a[w, 0]$    (3)

by II.1.5, and

    $\vdash Si \le w \rightarrow \beta(c, Si) = a[w, Si]$

that is,

    $\vdash i < w \rightarrow \beta(c, Si) = a[w, Si]$    (4)

in view of II.1.10. Putting the generalization of (4) together (conjunction)
with (3), via (1) and (2), yields

    $\vdash \beta(c, 0) = w \land (\forall i)_{<w}(\beta(c, Si) = St[i])$

from which Ax2 yields (FC′).


II.2.10 Definition. We introduce the functions “lh” and “$(\ldots)_{\ldots}$” by

    $\mathrm{lh}(z) = \beta(z, 0)$
    $(z)_i = \delta(\beta(z, Si), S0)$    (∗)

In the second definition in the group (∗) above we have introduced a new
2-ary function symbol called, let us say, f, by $fzi = \delta(\beta(z, Si), S0)$ (in
informal lightface notation, $\beta(z, i + 1) \mathbin{\dot{-}} 1$) and then agreed to denote the
term “fzi” by $(z)_i$.†


II.2.11 Proposition. If we let $b = \langle t[i] : i < x \rangle$, then we can obtain
(1) $\vdash \mathrm{lh}(b) = x$,
(2) $\vdash (\forall i)_{<x}(b)_i = t[i]$, or, equivalently, $\vdash (\forall i)_{<\mathrm{lh}(b)}(b)_i = t[i]$, and
(3) $\vdash z < b \rightarrow (\neg\, \mathrm{lh}(z) = x \lor (\exists i)_{<x} \neg\, (z)_i = t[i])$.


† Note how this closely parallels the “$(z)_i$” of the prime-power coding. We had set there (p. 137)
$(z)_i = \exp(i, z) \mathbin{\dot{-}} 1$.

Proof. (1): By (f) and (FC) (p. 239),

    $\vdash \beta(b, 0) = x$

We conclude using II.2.10.
(2): By (f) and (FC),

    $\vdash i < x \rightarrow \beta(b, Si) = St[i]$

We conclude using $\vdash \delta(St[i], S0) = t[i]$ and II.2.10.
(3): To prove the last contention we invoke (f), p. 218, in connection
with (FC). It yields

    $\vdash (\forall z)_{<b} \neg(\beta(z, 0) = x \land (\forall i)(i < x \rightarrow \beta(z, Si) = St[i]))$

Hence, using II.2.10, the Leibniz rule, and specialization,

    $\vdash z < b \rightarrow (\neg\, \mathrm{lh}(z) = x \lor (\exists i)_{<x} \neg\, (z)_i = t[i])$


Item (3) above suggests how to test a number for being a sequence code or
not. We define (exactly as on p. 165, but in boldface)


II.2.12 Definition. We introduce the unary predicate Seq by

    $\mathrm{Seq}(z) \leftrightarrow (\forall i)_{<\mathrm{lh}(z)}\, \beta(z, Si) > 0 \;\land\; (\forall x)_{<z}\big(\mathrm{lh}(x) \ne \mathrm{lh}(z) \lor (\exists i)_{<\mathrm{lh}(z)} (z)_i \ne (x)_i\big)$


The first conjunct above tests that we did not forget to add 1 to the sequence
members before coding. The second conjunct says that our code z is minimum,
because, for any smaller “number” x, whatever sequence the latter may code
(“minimally” or not) cannot be the same sequence as the one that z (minimally)
codes.


II.2.13 Proposition. $\vdash \mathrm{Seq}(z) \rightarrow (\forall i)_{<\mathrm{lh}(z)}\, z > (z)_i$.


Proof. Assume the hypothesis. Specialization and II.2.12 yield

    $\vdash i < \mathrm{lh}(z) \rightarrow \beta(z, Si) > 0$    (1)

By II.1.10, $\vdash \beta(z, Si) > 0 \rightarrow \beta(z, Si) \ge S0$; thus, by II.1.27 and II.2.10, (1)
yields

    $\vdash i < \mathrm{lh}(z) \rightarrow \beta(z, Si) = (z)_i + S0$

By +2 and +1 we now get

    $\vdash i < \mathrm{lh}(z) \rightarrow \beta(z, Si) = S((z)_i)$

which, by II.2.4(i), rests our case.


II.2.14 Remark. The inequality

    $(z)_i < z$    (i)

is very important when it comes to doing induction on sequence codes and
(soon) on Gödel numbers.

By Example II.2.7 we see that unless we do something about it, we are not
justified in expecting $\vdash \beta(z, i) < z$, in general. What we did to ensure that
induction on codes is possible was to add 1 to each sequence member t[i], a
simple trick that was already employed in the prime-power coding (p. 136),
although for different reasons there.† This device ensures (i) by II.2.13.

Another way to ensure (i) is to invoke Corollary II.2.6 and modify (FC) to
read instead

    $(\mu z)(\beta(z, 0) = w \land w < z \land (\forall i)_{<w}(\beta(z, Si) < z \land \beta(z, Si) = t[i]))$

Note that we did not need to “add 1” to t above.

We prefer our first solution (Definition II.2.8), if nothing else, because it
allows 0 to code the “empty sequence” (see II.2.15 below), a fact that is intuitively
pleasing.

This is a good place to mention that while our “β” (in either the bold or
the lightface version) is, essentially, that in Shoenfield (1967), we had to tweak
the latter somewhat (especially the derived “$(\ldots)_{\ldots}$”) to get it to be
induction-friendly (in particular, to have (i) above).

The version in Shoenfield (1967, (5), p. 116) is

    $\beta(a, i) = (\mu x)\big(x = a \lor (\exists y)_{<a}(\exists z)_{<a}(a = J'(y, z) \land S(zSJ'(x, i)) \mid y)\big)$    (Sh)

where‡

    $J'(y, z) = (y + z)^2 + y + 1$

Since, intuitively speaking, 4 is not in the range of $J'$, i.e., formally,

    $\vdash \neg(\exists y)(\exists z)\, J'(y, z) = 4$

† Namely, to enable us to implicitly store in the code the length of the coded sequence.
‡ The two “J”s, the one employed here and Shoenfield’s $J'$, trivially satisfy $J'(y, z) = SJ(y, z)$.

we have

    $\vdash \beta(4, i) = 4$

because the search $(\exists y)_{<a}(\exists z)_{<a}(\ldots)$ in the defining axiom (Sh) fails. This
invalidates (7), p. 116 of Shoenfield (1967),† on which (i) hinges in that book.‡


II.2.15 Example. Since $\vdash 0 = \beta(0, 0)$ by II.2.7, we have $\vdash \mathrm{lh}(0) = 0$, but
also, by <1, that 0 is a minimum code of a sequence. Indeed, $\vdash \mathrm{Seq}(0)$ by <1.
Since the sequence in question has 0 length, it is called the empty sequence.

We often write the minimum code, (FC), of the empty sequence as “$\langle\,\rangle$”, that is,

    $\vdash \langle\,\rangle = 0$


To conclude with the introduction of our formal coding tools we will also 
need a formal concatenation function. 


II.2.16 Definition (Concatenation). “∗”, a 2-ary function symbol, is
introduced via the μ-term below (denoted by “x ∗ y”):

    $x * y = (\mu z)\big(\beta(z, 0) = \mathrm{lh}(x) + \mathrm{lh}(y) \;\land$
    $\qquad (\forall i)_{<\mathrm{lh}(x)}\, \beta(z, Si) = S((x)_i) \;\land$    (∗∗)
    $\qquad (\forall i)_{<\mathrm{lh}(y)}\, \beta(z, Si + \mathrm{lh}(x)) = S((y)_i)\big)$

The legitimacy of (∗∗) relies on
II.2.17 Proposition.

    $\vdash (\exists w)\big(\beta(w, 0) = \mathrm{lh}(x) + \mathrm{lh}(y) \;\land$
    $\qquad (\forall i)_{<\mathrm{lh}(x)}\, \beta(w, Si) = S((x)_i) \;\land$    (1)
    $\qquad (\forall i)_{<\mathrm{lh}(y)}\, \beta(w, Si + \mathrm{lh}(x)) = S((y)_i)\big)$

Since the x and y are free variables, (1) says, informally, that for all “values” of
x and y a w that “works” exists. That is, the natural interpretation of ∗ over $\mathfrak{N}$ is
total. In other words, just like the “∗” that we saw earlier on, which was based
on prime power coding (see I.8.30, p. 136), this new “∗” makes sense regardless
of whether or not its arguments are minimum codes according to (FC).


† That $\vdash \neg\, a = 0 \rightarrow \beta(a, i) < a$.
‡ (8) in loc. cit., p. 117 [deduced from (7) in loc. cit.]: $\vdash a \ne 0 \rightarrow \mathrm{lh}(a) < a \land (a)_i < a$.

Proof. To see why (1) holds, we introduce the 3-ary function symbol a by

    $a(n, x, y) = \begin{cases} \mathrm{lh}(x) + \mathrm{lh}(y) & \text{if } n = 0 \\ S((x)_{\delta(n, S0)}) & \text{if } 0 < n \land n \le \mathrm{lh}(x) \\ S((y)_{\delta(n, S(\mathrm{lh}(x)))}) & \text{if } \mathrm{lh}(x) < n \end{cases}$    (A)

By II.2.5,

    $\vdash (\exists w)(\forall i)(i \le \mathrm{lh}(x) + \mathrm{lh}(y) \rightarrow \beta(w, i) = a(i, x, y))$

Let us add a new constant, z, and the assumption

    $(\forall i)(i \le \mathrm{lh}(x) + \mathrm{lh}(y) \rightarrow \beta(z, i) = a(i, x, y))$    (2)

We now show that we can prove (1) from (2).

It may be that it is not the case that $\vdash \mathrm{Seq}(z)$. This is just fine, as the proof
needs no such assumption.

We verify each conjunct of (1) separately:
• Since (A) yields (p. 221) $\vdash n = 0 \rightarrow a(n, x, y) = \mathrm{lh}(x) + \mathrm{lh}(y)$ and hence
    $\vdash a(0, x, y) = \mathrm{lh}(x) + \mathrm{lh}(y)$
(2), specialization, and II.1.5 yield
    $\vdash \beta(z, 0) = \mathrm{lh}(x) + \mathrm{lh}(y)$    (3)
• Next we prove
    $\vdash i < \mathrm{lh}(x) \rightarrow \beta(z, Si) = S((x)_i)$    (4)
By II.1.10 this amounts to proving
    $\vdash Si \le \mathrm{lh}(x) \rightarrow \beta(z, Si) = S((x)_i)$    (4′)
Now, since $\vdash \mathrm{lh}(x) \le \mathrm{lh}(x) + \mathrm{lh}(y)$, (2) yields
    $\vdash Si \le \mathrm{lh}(x) \rightarrow \beta(z, Si) = a(Si, x, y)$
Moreover, (A) and $\vdash 0 < Si$ yield (p. 221)
    $\vdash Si \le \mathrm{lh}(x) \rightarrow a(Si, x, y) = S((x)_{\delta(Si, S0)})$
and since $\vdash \delta(Si, S0) = i$, we get (4′) above.
• Finally, we want to verify
    $\vdash i < \mathrm{lh}(y) \rightarrow \beta(z, Si + \mathrm{lh}(x)) = S((y)_i)$    (5)
which amounts to
    $\vdash Si \le \mathrm{lh}(y) \rightarrow \beta(z, Si + \mathrm{lh}(x)) = S((y)_i)$    (5′)
using II.1.10. Add $Si \le \mathrm{lh}(y)$. Hence $\vdash Si + \mathrm{lh}(x) \le \mathrm{lh}(x) + \mathrm{lh}(y)$ by
II.1.17. Thus, by (2),
    $\vdash \beta(z, Si + \mathrm{lh}(x)) = a(Si + \mathrm{lh}(x), x, y)$    (6)
By (A),
    $\vdash \mathrm{lh}(x) < Si + \mathrm{lh}(x) \rightarrow a(Si + \mathrm{lh}(x), x, y) = S((y)_{\delta(Si + \mathrm{lh}(x), S(\mathrm{lh}(x)))})$    (7)
Hence, noting that
    $\vdash \mathrm{lh}(x) < Si + \mathrm{lh}(x)$ (by II.1.17)
    $\vdash Si + \mathrm{lh}(x) = i + S(\mathrm{lh}(x))$
and
    $\vdash \delta(i + S(\mathrm{lh}(x)), S(\mathrm{lh}(x))) = i$ (by II.1.27)
(6) and (7) yield
    $\vdash \beta(z, Si + \mathrm{lh}(x)) = S((y)_i)$
which by the deduction theorem yields (5′) and hence (5).

Putting now the conjuncts (3)–(5) together and applying Ax2 yields (1).


II.2.18 Example. We verify that for any terms t and s

    $\vdash \langle t[i] : i < n \rangle * \langle s[i] : i < m \rangle = \langle a[i] : i < n + m \rangle$

where

    $a[i] = \begin{cases} t[i] & \text{if } i < n \\ s[\delta(i, n)] & \text{if } n \le i \land i < n + m \\ 0 & \text{otherwise} \end{cases}$

Setting $x = \langle t[i] : i < n \rangle$ and $y = \langle s[i] : i < m \rangle$ for convenience, we get
from II.2.11

    $\vdash \mathrm{lh}(x) = n$
    $\vdash \mathrm{lh}(y) = m$
    $\vdash i < n \rightarrow (x)_i = t[i]$

and

    $\vdash i < m \rightarrow (y)_i = s[i]$

Pause. (Important.) None of the above four facts need the assertion that x and
y are minimum “$\langle \ldots \rangle$-codes”. We have invoked above only the part of II.2.11
that does not rely on minimality. Only the last claim in II.2.11 does.

Thus, by II.2.16,

    $\vdash x * y = (\mu z)\big(\mathrm{lh}(z) = n + m \;\land$
    $\qquad (\forall i)_{<n}\, \beta(z, Si) = St[i] \;\land$    (8)
    $\qquad (\forall i)_{<m}\, \beta(z, Si + n) = Ss[i]\big)$

On the other hand,

    $\vdash \langle a[i] : i < n + m \rangle = (\mu w)\big(\mathrm{lh}(w) = n + m \;\land$
    $\qquad (\forall i)_{<n}\, \beta(w, Si) = St[i] \;\land$    (9)
    $\qquad (\forall i)_{<m}\, \beta(w, Si + n) = Ss[i]\big)$

by the way a was defined (noting that $\vdash n \le i + n$ and $\vdash \delta(i + n, n) = i$).
By (8) and (9), we are done.

Pause. Parting comment: From our earlier Pause, we see that even in the case
when x and y are not the minimum $\langle \ldots \rangle$-codes for the sequences t[i] : i < n
and s[i] : i < m respectively, but nevertheless happen to satisfy the following,

    $\vdash \mathrm{lh}(x) = n$
    $\vdash \mathrm{lh}(y) = m$
    $\vdash i < n \rightarrow (x)_i = t[i]$

and

    $\vdash i < m \rightarrow (y)_i = s[i]$

then the work immediately above still establishes

    $\vdash x * y = \langle a[i] : i < n + m \rangle$
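II.2.19 Proposition. If we let b abbreviate $\langle t[i] : i < x \rangle$, then we have

    $\vdash \mathrm{Seq}(b)$

Proof. Immediate from II.2.8, II.2.11, and II.2.12.


II.2.20 Exercise. $\vdash \mathrm{Seq}(x * y)$.

II.2.21 Example. We introduce the abbreviation t below:

    $t = (\mu z)\big(\mathrm{Seq}(z) \land \mathrm{lh}(z) = \mathrm{lh}(x) + \mathrm{lh}(y) \;\land$
    $\qquad (\forall i)_{<\mathrm{lh}(x)} (z)_i = (x)_i \;\land$    (1)
    $\qquad (\forall i)_{<\mathrm{lh}(y)} (z)_{i + \mathrm{lh}(x)} = (y)_i\big)$

and prove

    $\vdash t = x * y$

By the definition of “∗” (II.2.16) and (f) (p. 218), using

    $\vdash x = Sy \rightarrow \delta(x, S0) = y$

(by II.1.27) and II.2.10 to remove instances of β, we obtain

    $\vdash \mathrm{lh}(x * y) = \mathrm{lh}(x) + \mathrm{lh}(y) \;\land$
    $\qquad (\forall i)_{<\mathrm{lh}(x)} (x * y)_i = (x)_i \;\land$    (2)
    $\qquad (\forall i)_{<\mathrm{lh}(y)} (x * y)_{i + \mathrm{lh}(x)} = (y)_i$

Since also $\vdash \mathrm{Seq}(x * y)$, by II.2.20, we get

    $\vdash t \le x * y$    (3)

by (2), (1), and (f) (p. 218).

To get the converse inequality, we use (1) and (f) to get

    $\vdash \mathrm{Seq}(t) \land \mathrm{lh}(t) = \mathrm{lh}(x) + \mathrm{lh}(y) \;\land$
    $\qquad (\forall i)_{<\mathrm{lh}(x)} (t)_i = (x)_i \;\land$    (4)
    $\qquad (\forall i)_{<\mathrm{lh}(y)} (t)_{i + \mathrm{lh}(x)} = (y)_i$

The first conjunct of (4) and II.2.12 (via II.1.27) yield,† from the remaining
conjuncts of (4),

    $\vdash \beta(t, 0) = \mathrm{lh}(x) + \mathrm{lh}(y) \;\land$
    $\qquad (\forall i)_{<\mathrm{lh}(x)}\, \beta(t, Si) = S((x)_i) \;\land$
    $\qquad (\forall i)_{<\mathrm{lh}(y)}\, \beta(t, Si + \mathrm{lh}(x)) = S((y)_i)$

By the definition of “∗” (II.2.16) and (f),

    $\vdash x * y \le t$

which along with (3) and II.1.8 rests the case.


† More expansively, $\vdash (\forall i)_{<\mathrm{lh}(t)}\, \beta(t, Si) \ge S0$ by II.2.12, which implies (by II.1.27 and II.2.10)
$\vdash (\forall i)_{<\mathrm{lh}(t)}\, \beta(t, Si) = (t)_i + S0$, but $\vdash z + S0 = Sz$.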


II.2.22 Example. By II.2.18,

    $\vdash 0 * \langle t[i] : i < x \rangle = \langle t[i] : i < x \rangle$

and

    $\vdash \langle t[i] : i < x \rangle * 0 = \langle t[i] : i < x \rangle$

In particular, $\vdash 0 * \langle\,\rangle = \langle\,\rangle$ (cf. II.2.15), i.e., $\vdash 0 * 0 = 0$. Note however that
if a is the term that denotes the natural coding of the empty sequence, that is,
$\vdash \beta(a, 0) = 0$ and $\vdash 0 < a$ (note that $\vdash \neg\, \mathrm{Seq}(a)$), then

    $\vdash 0 * a = 0$

since $\vdash \mathrm{Seq}(0 * a)$, and therefore

    $\vdash \neg\, 0 * a = a$

Thus we have two ways to “β-code” sequences t[i], for i < x. One is to pick
any c that satisfies $\vdash \beta(c, 0) = x$ and $\vdash (\forall i)_{<x}\, \beta(c, Si) = St[i]$. The other is to
employ the minimum “$\langle \ldots \rangle$-code”, $b = \langle t[i] : i < x \rangle$. By II.2.19, b (but not
necessarily c) “satisfies” Seq.


II.3. Formal Primitive Recursion 


II.3.1 Theorem (Primitive Recursive Definitions). Let h and g be n-ary and
(n + 2)-ary function symbols. Then we may introduce a new (n + 1)-ary
function symbol, f, such that

    $\vdash f(0, \vec{y}) = h(\vec{y})$

and

    $\vdash f(Sx, \vec{y}) = g(x, \vec{y}, f(x, \vec{y}))$

Proof. We prove

    $\vdash (\exists z)\big(\mathrm{Seq}(z) \land \mathrm{lh}(z) = Sx \land (z)_0 = h(\vec{y}) \land (\forall i)_{<x}((z)_{Si} = g(i, \vec{y}, (z)_i))\big)$    (1)

This is done by formal induction on x.

For x = 0, taking $z = \langle h(\vec{y}) \rangle$ works (this invokes <1 and Ax2).

Assume now (1) for frozen variables (the I.H.). For the induction step we
want to show

    $\vdash (\exists z)\big(\mathrm{Seq}(z) \land \mathrm{lh}(z) = SSx \land (z)_0 = h(\vec{y}) \land (\forall i)_{<Sx}((z)_{Si} = g(i, \vec{y}, (z)_i))\big)$    (2)

To this end, let a be a new constant, and add the assumption (invoking I.4.27
and (1))

    $\mathrm{Seq}(a) \land \mathrm{lh}(a) = Sx \land (a)_0 = h(\vec{y}) \land (\forall i)_{<x}((a)_{Si} = g(i, \vec{y}, (a)_i))$    (3)

Set now

    $b = a * \langle g(x, \vec{y}, (a)_x) \rangle$

By II.2.21 and (f) (p. 218)

    $\vdash \mathrm{Seq}(b) \land \mathrm{lh}(b) = SSx \land (\forall i)_{<Sx} (b)_i = (a)_i \land (b)_{Sx} = g(x, \vec{y}, (a)_x)$    (4)

the last conjunct being distilled from

    $\vdash i < S0 \rightarrow (b)_{i + Sx} = g(x, \vec{y}, (a)_x)$

and $\vdash i < S0 \leftrightarrow i = 0$.
Thus, using (3) and the Leibniz axiom (Ax4),

    $\vdash \mathrm{Seq}(b) \land \mathrm{lh}(b) = SSx \land (b)_0 = h(\vec{y}) \;\land$
    $\qquad (\forall i)_{<x} (b)_{Si} = g(i, \vec{y}, (b)_i) \;\land$
    $\qquad (b)_{Sx} = g(x, \vec{y}, (b)_x)$    (since $\vdash (b)_x = (a)_x$ by (4))

In short, using (I.7.2) $\vdash (b)_{Sx} = g(x, \vec{y}, (b)_x) \leftrightarrow (\forall i)_{=x} (b)_{Si} = g(i, \vec{y}, (b)_i)$,
$\forall$-distribution over $\land$, and <2, we have

    $\vdash \mathrm{Seq}(b) \land \mathrm{lh}(b) = SSx \land (b)_0 = h(\vec{y}) \land (\forall i)_{<Sx} (b)_{Si} = g(i, \vec{y}, (b)_i)$

This proves (2) by Ax2 and concludes the inductive proof of (1). We can now let
    $F(x, \vec{y}) = (\mu z)\big(\mathrm{Seq}(z) \land \mathrm{lh}(z) = Sx \land (z)_0 = h(\vec{y}) \land (\forall i)_{<x}((z)_{Si} = g(i, \vec{y}, (z)_i))\big)$

where F is a new function symbol.

By (f) (p. 218),

    $\vdash (F(x, \vec{y}))_0 = h(\vec{y})$
    $\vdash i < x \rightarrow (F(x, \vec{y}))_{Si} = g(i, \vec{y}, (F(x, \vec{y}))_i)$    (5)

The first part of (5) yields (by substitution)

    $\vdash (F(0, \vec{y}))_0 = h(\vec{y})$    (6)

The second yields (by substitution $[x \leftarrow Si]$)

    $\vdash (F(Si, \vec{y}))_{Si} = g(i, \vec{y}, (F(Si, \vec{y}))_i)$    (7)

We now claim that

    $\vdash (F(Si, \vec{y}))_i = (F(i, \vec{y}))_i$    (8)

It is convenient to prove a bit more, by induction on i, namely, that

    $\vdash (\forall x)(i \le x \rightarrow (F(i, \vec{y}))_i = (F(x, \vec{y}))_i)$    (9)

The case for i = 0 follows from (6) and the first of (5).

We now assume (9) as our I.H., freezing the free variables (i and $\vec{y}$).

For our induction step we need to prove

    $(\forall x)(Si \le x \rightarrow (F(Si, \vec{y}))_{Si} = (F(x, \vec{y}))_{Si})$

but we prove instead

    $Si \le x \rightarrow (F(Si, \vec{y}))_{Si} = (F(x, \vec{y}))_{Si}$

To this end, we assume $Si \le x$, freezing x, and proceed to verify

    $\vdash (F(Si, \vec{y}))_{Si} = (F(x, \vec{y}))_{Si}$    (10)

Well, since also $\vdash i \le x$ and $\vdash i \le Si$, the I.H. (9) yields by specialization

    $\vdash (F(i, \vec{y}))_i = (F(x, \vec{y}))_i$

and

    $\vdash (F(i, \vec{y}))_i = (F(Si, \vec{y}))_i$

Therefore,

    $\vdash g(i, \vec{y}, (F(Si, \vec{y}))_i) = g(i, \vec{y}, (F(x, \vec{y}))_i)$

which, by the second part of (5) and (7), verifies (10).
Having thus established (9) and therefore (8), (7) becomes

    $\vdash (F(Si, \vec{y}))_{Si} = g(i, \vec{y}, (F(i, \vec{y}))_i)$

All this proves that if f is a new function symbol introduced by

    $f(x, \vec{y}) = (F(x, \vec{y}))_x$

then the two formulas stated in the theorem are proved.
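
The idea of the proof, informally: f(x, y) is read off as the last entry of the finite “history” F(x, y). The sketch below mirrors this with Python lists standing in for sequence codes (an assumption; no β-coding is performed), and with a single parameter y instead of a vector. The names `history` and `prim_rec` are hypothetical.

```python
from typing import Callable, List


def history(h: Callable[[int], int],
            g: Callable[[int, int, int], int],
            x: int, y: int) -> List[int]:
    """Stand-in for F(x, y): the list z of length x + 1 with
    z[0] = h(y) and z[i + 1] = g(i, y, z[i]) for i < x."""
    z = [h(y)]
    for i in range(x):
        z.append(g(i, y, z[i]))
    return z


def prim_rec(h, g):
    """f(x, y) = (F(x, y))_x, the x-th entry of the history."""
    return lambda x, y: history(h, g, x, y)[x]


# Addition and multiplication by recursion on the first argument.
add = prim_rec(lambda y: y, lambda i, y, w: w + 1)
mul = prim_rec(lambda y: 0, lambda i, y, w: add(y, w))

# Claim (9): histories of different lengths agree on their common part.
h0, g0 = (lambda y: 0), (lambda i, y, w: add(y, w))
stable = all(history(h0, g0, i, 3)[i] == history(h0, g0, x, 3)[i]
             for x in range(8) for i in range(x + 1))
print(add(4, 5), mul(4, 5), stable)
```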


II.3.2 Exercise. Verify that if h and g are μ-defined from $\Delta_0(L_A)$ formulas,
then so is f.


II.3.3 Theorem (Course-of-Values Recursion). Let h and g be function
symbols of arities n and n + 2 respectively. Then we may introduce a new function
symbol H of arity n + 1 such that

    $\vdash (H(0, \vec{y}))_0 = h(\vec{y})$
    $\vdash (H(Sx, \vec{y}))_{Sx} = g(x, \vec{y}, H(x, \vec{y}))$    (1)
    $\vdash \mathrm{Seq}(H(x, \vec{y})) \land \mathrm{lh}(H(x, \vec{y})) = Sx$


Proof. Invoking II.3.1, we introduce a new function symbol H by the primitive
recursion below, and show that it works:

    $H(0, \vec{y}) = \langle h(\vec{y}) \rangle$
    $H(Sx, \vec{y}) = H(x, \vec{y}) * \langle g(x, \vec{y}, H(x, \vec{y})) \rangle$    (2)

The third formula in the group (1) is proved by induction on x. The basis is
immediate from the basis of the recursion (2) (note that a “$\langle \ldots \rangle$” term satisfies
Seq by II.2.19).

Assume now the contention for frozen x. By II.2.21 and the second formula
in group (2),

    $\vdash \mathrm{lh}(H(Sx, \vec{y})) = SSx$

and

    $\vdash \mathrm{Seq}(H(Sx, \vec{y}))$

This concludes the induction. The other two contentions in the theorem are
direct consequences of the two recurrence equations (2).


II.3.4 Remark. Introducing yet another function symbol, $f$, by
$$f(x,\vec y) = (H(x,\vec y))_x$$
yields at once
$$\vdash f(0,\vec y) = h(\vec y)$$
$$\vdash f(Sx,\vec y) = g(x,\vec y,H(x,\vec y))$$
This is the standard way that course-of-values recursion is presented, i.e., defining a function $f$ from known functions $h$ and $g$ and from the “history”, $H(x,\vec y) = \langle f(i,\vec y) : i \le x\rangle$, of $f$.
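The recurrence (2) and the extraction $f(x,\vec y) = (H(x,\vec y))_x$ can be mimicked informally on ordinary integers. The following is only a sketch under stated assumptions: Python tuples stand in for the coded sequences $\langle\ldots\rangle$, tuple concatenation stands in for $*$, and the names `course_of_values`, `h0`, `g0` and `fib` are ours, chosen for illustration.

```python
def course_of_values(h, g):
    """Build the history function H and the function f it codes.

    Informal counterpart of the recursion (2):
      H(0, ys)   = <h(ys)>
      H(x+1, ys) = H(x, ys) * <g(x, ys, H(x, ys))>
    Tuples play the role of coded sequences (an assumption of this sketch).
    """
    def H(x, *ys):
        hist = (h(*ys),)
        for k in range(x):
            hist = hist + (g(k, *ys, hist),)
        return hist

    def f(x, *ys):
        # f(x, ys) is the last entry, i.e. entry number x, of the history.
        return H(x, *ys)[x]

    return H, f


# Illustration: a Fibonacci-style function needs the two previous values,
# which g0 reads off the history (here n = 0, so there are no parameters ys).
def h0():
    return 0

def g0(k, hist):
    return 1 if k == 0 else hist[k] + hist[k - 1]

H, fib = course_of_values(h0, g0)
```

For instance, `H(3)` is the tuple `(0, 1, 1, 2)`, whose length is $3 + 1$ as the third formula of group (1) requires, and `fib(10)` evaluates to 55.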


We recall the concept of a term-definable (total) function (p. 181). We can now 
prove 


II.3.5 Theorem. Every primitive recursive function is term-definable in (some recursive extension of) PA.


Proof. The proof is by (informal) induction on $\mathcal{PR}$. For the basis, we already know that the initial functions are term-definable in ROB, a subtheory of PA.


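Let now $f, g_1, \ldots, g_n$ be defined by the terms $\mathbf f, \mathbf g_1, \ldots, \mathbf g_n$. We will argue that the term $\mathbf f(\mathbf g_1(\vec y_m), \ldots, \mathbf g_n(\vec y_m))$ defines $\lambda\vec y_m.\,f(g_1(\vec y_m), \ldots, g_n(\vec y_m))$.
So let $f(g_1(\vec a_m), \ldots, g_n(\vec a_m)) = b$ be true. Then, for appropriate $\vec c_n$,
$$\begin{aligned}
f(c_1,\ldots,c_n) &= b\\
g_1(\vec a_m) &= c_1\\
&\;\;\vdots\\
g_n(\vec a_m) &= c_n
\end{aligned}$$
By the induction hypothesis,
$$\begin{aligned}
&\vdash \mathbf f(\widetilde{c_1},\ldots,\widetilde{c_n}) = \widetilde b\\
&\vdash \mathbf g_1(\widetilde{a_1},\ldots,\widetilde{a_m}) = \widetilde{c_1}\\
&\;\;\vdots\\
&\vdash \mathbf g_n(\widetilde{a_1},\ldots,\widetilde{a_m}) = \widetilde{c_n}
\end{aligned}$$
Hence (via Ax4)
$$\vdash \mathbf f(\mathbf g_1(\widetilde{a_1},\ldots,\widetilde{a_m}),\ldots,\mathbf g_n(\widetilde{a_1},\ldots,\widetilde{a_m})) = \widetilde b$$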


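Let finally $h$ and $g$ be term-defined by $\mathbf h$ and $\mathbf g$ respectively, and let $f$ be given by the schema below (for all $x, \vec y$):
$$\begin{aligned}
f(0,\vec y) &= h(\vec y)\\
f(x+1,\vec y) &= g(x,\vec y,f(x,\vec y))
\end{aligned}$$
We verify that the term $\mathbf f(x,\vec y)$, where $\mathbf f$ is introduced by formal primitive recursion below, defines $f$:
$$\begin{aligned}
\mathbf f(0,\vec y) &= \mathbf h(\vec y)\\
\mathbf f(x+1,\vec y) &= \mathbf g(x,\vec y,\mathbf f(x,\vec y))
\end{aligned} \tag{1}$$
To this end, we prove by (informal) induction on $a$ that†
$$f(a,\vec b) = c \quad\text{implies}\quad \vdash \mathbf f(\widetilde a,\widetilde{\vec b}) = \widetilde c \tag{2}$$

Let $a = 0$. Then $f(0,\vec b) = c$ entails $h(\vec b) = c$; hence $\vdash \mathbf h(\widetilde{\vec b}) = \widetilde c$ by the I.H. (of the $\mathcal{PR}$-induction).

By the first equation of group (1), $\vdash \mathbf f(0,\widetilde{\vec b}) = \widetilde c$, which settles the basis of the $a$-induction. Now fix $a$, and take (2) as the I.H. of the $a$-induction. We will argue the case of (2) when $a$ is replaced by $a + 1$.

Let $f(a + 1,\vec b) = c$. Then $g(a,\vec b,d) = c$, where $f(a,\vec b) = d$. By the I.H. of the $\mathcal{PR}$-induction and $a$-induction, i.e., (2),
$$\vdash \mathbf g(\widetilde a,\widetilde{\vec b},\widetilde d) = \widetilde c$$
and
$$\vdash \mathbf f(\widetilde a,\widetilde{\vec b}) = \widetilde d$$
Thus, by Ax4 and the second equation of group (1),
$$\vdash \mathbf f(\widetilde{a+1},\widetilde{\vec b}) = \widetilde c$$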


The above suggests the following definition. 


II.3.6 Definition (Formal Primitive Recursive Functions). A defined function symbol $\mathbf f$ is primitive recursive iff there is a sequence of function symbols $\mathbf g_1,\ldots,\mathbf g_p$ such that $\mathbf g_p \equiv \mathbf f$ and, for all $i = 1,\ldots,p$, $\mathbf g_i$ has been introduced by a primitive recursive definition
$$\begin{aligned}
\mathbf g_i(0,v_1,\ldots,v_{n-1}) &= \mathbf g_j(v_1,\ldots,v_{n-1})\\
\mathbf g_i(Sv_0,v_1,\ldots,v_{n-1}) &= \mathbf g_k(v_0,v_1,\ldots,v_{n-1},\mathbf g_i(v_0,\ldots,v_{n-1}))
\end{aligned}$$
where $j < i$ and $k < i$ and the $\mathbf g_j$ and $\mathbf g_k$ have arities $n - 1$ and $n + 1$ respectively, or by composition, that is, an explicit definition such as
$$\mathbf g_i(v_0,\ldots,v_{n-1}) = \mathbf g_{j_0}(\mathbf g_{j_1}(v_0,\ldots,v_{n-1}),\ldots,\mathbf g_{j_r}(v_0,\ldots,v_{n-1}))$$


† $\widetilde{\vec b}$ means $\widetilde{b_1}, \widetilde{b_2}, \ldots$

using, again, previous function symbols (of the correct arities) in the sequence (i.e., $j_m < i$, $m = 0,\ldots,r$). Otherwise, $\mathbf g_i$ is one of $S$, $Z$ (zero function symbol), or $U_i^n$ ($n > 0$, $1 \le i \le n$) (projection function symbols), where the latter two symbols are introduced by the defining axioms
$$Z(v_0) = 0$$
and
$$U_i^n(v_0,\ldots,v_{n-1}) = v_{i-1}$$
for each positive $n$ and $1 \le i \le n$ in $\mathbb N$.
A sequence such as $\mathbf g_1,\ldots,\mathbf g_p$ is a formal (primitive recursive) derivation of $\mathbf g_p$.


A term is primitive recursive iff it is defined from 0, variables, and primitive 
recursive function symbols according to the standard definition of “term”. 
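The shape of a derivation can be seen informally by evaluating each kind of step on ordinary integers. The sketch below is an illustration only, not the formal definition: Python functions stand in for function symbols, and the helper names `U`, `compose`, `prim_rec`, `add` and `mult` are ours. The argument order of the step function follows the second recurrence equation of II.3.6, with the recursive value passed last.

```python
def Z(v0):
    """Zero function: Z(v0) = 0."""
    return 0

def S(v0):
    """Successor."""
    return v0 + 1

def U(n, i):
    """Projection U_i^n(v_0, ..., v_{n-1}) = v_{i-1}, for 1 <= i <= n."""
    def proj(*vs):
        assert len(vs) == n, "wrong number of arguments"
        return vs[i - 1]
    return proj

def compose(outer, *inner):
    """Composition: outer(inner_1(vs), ..., inner_r(vs))."""
    return lambda *vs: outer(*(g(*vs) for g in inner))

def prim_rec(base, step):
    """Primitive recursion on the first argument:
         f(0, rest)   = base(rest)
         f(x+1, rest) = step(x, rest, f(x, rest))
    """
    def f(v0, *rest):
        acc = base(*rest)
        for k in range(v0):
            acc = step(k, *rest, acc)
        return acc
    return f

# A derivation of addition, then of multiplication that reuses it.
add = prim_rec(U(1, 1), compose(S, U(3, 3)))            # add(Sx, y) = S(add(x, y))
mult = prim_rec(Z, compose(add, U(3, 3), U(3, 2)))      # mult(Sx, y) = add(mult(x, y), y)
```

Here `add(3, 4)` evaluates to 7 and `mult(3, 4)` to 12; each assignment plays the role of one entry $\mathbf g_i$ of a derivation that refers only to earlier entries.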


A predicate $P$ is primitive recursive iff there is a primitive recursive function symbol $f$ of the same arity as $P$ such that $P(\vec x) \leftrightarrow f(\vec x) = 0$ is provable.

By a slip of the tongue, we may say that $P$ is primitive recursive iff its formal characteristic function, $\chi_P$, is. (See however II.1.25, p. 220.)


II.3.7 Remark. In order that the primitive recursive derivations do not monopolize our supply of function symbols, we can easily arrange that primitive recursive function symbols are chosen from an appropriate subsequence of the “standard function symbol sequence $f_i^n$” of p. 166. We fix the following very convenient scheme that is informed by the table on p. 138.

Let $a$ be the largest $f$-index used to allocate all the finitely many function symbols required by $L_N$ (three) and the finitely many required for the introduction of $\beta$ and the $\langle\ldots\rangle$ coding.

We let, for convenience, $b = a + 1$, and we allocate the formal primitive recursive function symbols from among the members of the subsequence $(f^n_{bk})_{k \ge 0}$, being very particular about the $k$-value ($k$-code) chosen (p. 138):

(1) $k = \langle 0, 1, 0\rangle$ is used for $Z$ ($n = 1$, of course).
(2) $k = \langle 0, 1, 1\rangle$ is used for $S$.†
(3) $k = \langle 0, n, i, 2\rangle$ is used for $U_i^n$. That is, $U_i^n$ is allocated as $f^n_{b\langle 0,n,i,2\rangle}$.


† It does no harm that $S$ is already allocated. After all, every “real” primitive recursive function (i.e., function viewed extensionally as a set of input-output pairs) has infinitely many different derivations, and hence infinitely many function symbols allocated to it.

(4) $k = \langle 1, m, f, g_1, \ldots, g_n\rangle$ is used if $f^m_{bk}$ is allocated to denote the result of composition from function symbols (already allocated) with $k$-codes equal to $f, g_1, \ldots, g_n$. Of these, the symbol with code $f$ must have arity $n$; all the others, arity $m$.

(5) $k = \langle 2, n + 1, h, g\rangle$ is used if $f^{n+1}_{bk}$ is allocated to denote the result of primitive recursion from function symbols (already allocated) with $k$-codes $h, g$. Of these the first must have arity $n$, the second $n + 2$.


This allocation scheme still leaves an infinite supply of unused (so far) symbols to be used for future extensions of the language.

As a parting comment we note that the seemingly contrived allocation scheme above forces the set of $k$-codes to be primitive recursive. (See Exercise II.13.)


By the proof of II.3.5, having the formal and informal derivations of the boldface $\mathbf f$ and the lightface $f$ “track each other” in the obvious way (i.e., assembling the boldface version exactly as the lightface version was assembled), we obtain $\mathbf f^{\mathfrak N} = f$, and


II.3.8 Theorem. If $\mathbf f$ is a formal primitive recursive function of arity $n$ (in $L_{N'}$), then, for all $\vec a_n$ in $\mathbb N^n$,
$$\mathbf f^{\mathfrak N}(\vec a_n) = b \quad\text{implies}\quad \vdash \mathbf f(\widetilde{\vec a_n}) = \widetilde b$$


II.3.9 Corollary. The “implies” in II.3.8 can be strengthened to “iff”. That is, every informal primitive recursive function is strongly term-definable (p. 187), indeed by a primitive recursive term.


Proof. Assume $\vdash \mathbf f(\widetilde{\vec a_n}) = \widetilde b$. Writing $f$ for $\mathbf f^{\mathfrak N}$, assume $f(\vec a_n) = c \ne b$. Then also $\vdash \mathbf f(\widetilde{\vec a_n}) = \widetilde c$; hence $\vdash \widetilde b = \widetilde c$. Now, $b \ne c$ yields (already in ROB) $\vdash \neg\,\widetilde b = \widetilde c$, contradicting consistency of PA′.


We are reminded that ROB and PA are consistent (since they each have a model), a fact that we often use implicitly.


It is trivial then that each recursive extension PA′ is also consistent, for such an extension is conservative.


† We base this assertion, of course, on the existence of a standard model $\mathfrak N$, a fact that provides a non-constructive proof of consistency. Constructive proofs of the consistency of ROB and PA are also known. See for example Shoenfield (1967), Schütte (1977).

II.3.10 Corollary. For every closed primitive recursive term $t$ there is a unique $n \in \mathbb N$ such that $\vdash t = \widetilde n$.


Proof. Existence is by II.3.8 (use $n = t^{\mathfrak N}$). Uniqueness follows from definability of $x = y$ in ROB (I.9.39, p. 181) and hence in any extension PA′.


Pause. Is it true that every closed term in a recursive extension of PA is provably equal to a numeral?


II.3.11 Corollary. Every primitive recursive relation is strongly definable by some formal primitive recursive predicate.


II.4. The Boldface $\boldsymbol\Delta$ and $\boldsymbol\Sigma$


II.4.1 Definition. Let $L_A$ be a language of arithmetic. The symbol $\boldsymbol\Sigma_1(L_A)$ denotes the set of formulas $\{(\exists x)\mathcal A : \mathcal A \in \boldsymbol\Delta_0(L_A)\}$.
If the language is $L_N$, then we simply write $\boldsymbol\Sigma_1$.


Usage of boldface type in $\boldsymbol\Sigma_1$ should distinguish this set of formulas over $L_N$ from the set of $\Sigma_1$-relations of the arithmetic hierarchy.


We also define variants of $\boldsymbol\Delta_0$ and $\boldsymbol\Sigma_1$ above.


II.4.2 Definition. Let $L_A$ be a language of arithmetic. The symbol $\boldsymbol\Delta_0^+(L_A)$ denotes the smallest set of formulas over $L_A$ that includes the atomic formulas, but also the negations of atomic formulas, and moreover satisfies the closure conditions: If $\mathcal A$ and $\mathcal B$ are in $\boldsymbol\Delta_0^+(L_A)$, then so are $\mathcal A \lor \mathcal B$, $\mathcal A \land \mathcal B$, $(\exists x)_{<t}\mathcal A$, and $(\forall x)_{<t}\mathcal A$ (where we require that the variable $x$ not occur in the term $t$).


If the language is $L_N$, then we simply write $\boldsymbol\Delta_0^+$.


II.4.3 Definition. Let $L_A$ be a language of arithmetic. The symbol $\boldsymbol\Delta_0'(L_A)$ denotes the smallest set of formulas over $L_A$ that includes the restricted atomic formulas (defined below), and their negations, and moreover satisfies the closure conditions: If $\mathcal A$ and $\mathcal B$ are in $\boldsymbol\Delta_0'(L_A)$, then so are $\mathcal A \lor \mathcal B$, $\mathcal A \land \mathcal B$, $(\exists x)_{<y}\mathcal A$ and $(\forall x)_{<y}\mathcal A$ (where $x$ and $y$ are different variables).

Correspondingly, the symbol $\boldsymbol\Sigma_1'(L_A)$ denotes the set of formulas $\{(\exists x)\mathcal A : \mathcal A \in \boldsymbol\Delta_0'(L_A)\}$.

If the language is $L_N$, then we simply write $\boldsymbol\Delta_0'$ and $\boldsymbol\Sigma_1'$.


Now, the restricted atomic formulas over $L_A$ are $x = y$, $0 = y$, $f\vec x_n = y$, and $P\vec x_n$, for all $n$-ary function ($f$) and predicate ($P$) symbols.

The restriction on the atomic formulas above was to have function and predicate letters act on variables (rather than arbitrary terms). Similarly, we have used $(\exists x)_{<y}$ and $(\forall x)_{<y}$ rather than $(\exists x)_{<t}$ and $(\forall x)_{<t}$ in II.4.3.

The superscript “+” in II.4.2 is indicative of the (explicit) presence of only positive closure operations ($\neg$ does not participate in the definition). The same is true of the $\boldsymbol\Delta_0'(L_A)$ formulas.

It turns out that both $\boldsymbol\Delta_0^+(L_A)$ and $\boldsymbol\Delta_0'(L_A)$ formulas are closed under negation (in the definition of these sets of formulas the application of “$\neg$” has been pushed as far to the right as possible). We prove this contention below. The introduction of $\boldsymbol\Delta_0^+(L_A)$ and $\boldsymbol\Delta_0'(L_A)$ is only offered for convenience (proof of II.4.12 below). Neither symbol is standard in the literature.


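II.4.4 Lemma. For any $\mathcal A \in \boldsymbol\Delta_0^+(L_A)$ (respectively, $\mathcal A \in \boldsymbol\Delta_0'(L_A)$) there is a $\mathcal B \in \boldsymbol\Delta_0^+(L_A)$ (respectively, $\mathcal B \in \boldsymbol\Delta_0'(L_A)$) such that $\vdash \neg\mathcal A \leftrightarrow \mathcal B$, where “$\vdash$” denotes logical (pure) provability.


Proof. We do induction on $\boldsymbol\Delta_0^+(L_A)$ (respectively, $\boldsymbol\Delta_0'(L_A)$; the induction variable is $\mathcal A$).

Basis. If $\mathcal A$ is atomic (or restricted atomic), then we are done at once. If it is a negated atomic (or negated restricted atomic) formula, say $\neg\mathcal C$, then we are done by $\vdash \neg\neg\mathcal C \leftrightarrow \mathcal C$.

Now $\mathcal A$ can have the following forms (if not atomic or negated atomic) by II.4.2–II.4.3:


(i) $\mathcal A \equiv \mathcal B \lor \mathcal C$. Then $\vdash \neg\mathcal A \leftrightarrow \neg\mathcal B \land \neg\mathcal C$, and we are done by the I.H. via the Leibniz rule.
(ii) $\mathcal A \equiv \mathcal B \land \mathcal C$. Then $\vdash \neg\mathcal A \leftrightarrow \neg\mathcal B \lor \neg\mathcal C$, and we are done by the I.H. via the Leibniz rule.
(iii) $\mathcal A \equiv (\exists x)_{<t}\mathcal B$. Then $\vdash \neg\mathcal A \leftrightarrow (\forall x)_{<t}\neg\mathcal B$, and we are done by the I.H. via the Leibniz rule.
(iv) $\mathcal A \equiv (\forall x)_{<t}\mathcal B$. Then $\vdash \neg\mathcal A \leftrightarrow (\exists x)_{<t}\neg\mathcal B$, and we are done by the I.H. via the Leibniz rule.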
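The induction in this proof is effectively an algorithm that pushes a negation inward. A minimal sketch follows, assuming a hypothetical representation of formulas as nested Python tuples with the tags `"atom"`, `"not"` (applied to atoms only), `"or"`, `"and"`, `"bex"` and `"ball"` (bounded quantifiers carrying a variable and a bounding term); none of these names come from the text.

```python
def negate(a):
    """Return a formula of the same class that is equivalent to the negation of `a`.

    `a` is assumed to be built from atoms and negated atoms by
    disjunction, conjunction and bounded quantification only.
    """
    tag = a[0]
    if tag == "atom":
        return ("not", a)                      # basis: atomic case
    if tag == "not":
        return a[1]                            # basis: double negation is dropped
    if tag == "or":                            # case (i)
        return ("and", negate(a[1]), negate(a[2]))
    if tag == "and":                           # case (ii)
        return ("or", negate(a[1]), negate(a[2]))
    if tag == "bex":                           # case (iii): bounded exists
        return ("ball", a[1], a[2], negate(a[3]))
    if tag == "ball":                          # case (iv): bounded for-all
        return ("bex", a[1], a[2], negate(a[3]))
    raise ValueError(f"not a formula of the expected shape: {a!r}")


example = ("bex", "x", "t", ("or", ("atom", "P"), ("not", ("atom", "Q"))))
negated = negate(example)
```

The result `negated` is `("ball", "x", "t", ("and", ("not", ("atom", "P")), ("atom", "Q")))`, and applying `negate` twice returns the original formula, so no connective other than the positive ones (and negation of atoms) is ever introduced.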


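II.4.5 Corollary. For any $\mathcal A \in \boldsymbol\Delta_0(L_A)$ there is a $\mathcal B \in \boldsymbol\Delta_0^+(L_A)$ such that we can prove $\mathcal A \leftrightarrow \mathcal B$ without nonlogical axioms.


Conversely, every $\mathcal B \in \boldsymbol\Delta_0^+(L_A)$ is a formula of $\boldsymbol\Delta_0(L_A)$.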


II.4.6 Lemma. Let $\mathcal A$ and $\mathcal B$ be in $\boldsymbol\Sigma_1(L_A)$ (respectively, $\boldsymbol\Sigma_1'(L_A)$). Then each of the following is provably equivalent in PA′ to a formula in $\boldsymbol\Sigma_1(L_A)$ (respectively, $\boldsymbol\Sigma_1'(L_A)$):
(i) $\mathcal A \lor \mathcal B$
(ii) $\mathcal A \land \mathcal B$
(iii) $(\exists x)\mathcal A$
(iv) $(\exists x)_{<z}\mathcal A$
(v) $(\forall x)_{<z}\mathcal A$.


By PA′ over $L_A$, above, we understand an extension by definitions of PA over $L_N$. (In this connection cf. II.1.2, p. 207.)


Proof. The proof is a straightforward formalization of the techniques employed in the proof of I.8.48. The only case that presents some interest is (v), and we give a proof here. The proof hinges on the fact†
$$\vdash (\forall x)_{<z}(\exists y)\mathcal B \leftrightarrow (\exists w)(\forall x)_{<z}(\exists y)_{<w}\mathcal B$$
where $w$ is a new variable.
The $\leftarrow$-direction of the above being trivial, we just prove
$$\vdash (\forall x)_{<z}(\exists y)\mathcal B \to (\exists w)(\forall x)_{<z}(\exists y)_{<w}\mathcal B \tag{1}$$
by induction on $z$.‡

The basis, $z = 0$, is settled by $<$1. Take now (1), with frozen variables, as the I.H. Add the assumption
$$(\forall x)_{<Sz}(\exists y)\mathcal B \tag{2}$$
We want
$$\vdash (\exists w)(\forall x)_{<Sz}(\exists y)_{<w}\mathcal B \tag{3}$$
By (2) and specialization,
$$\vdash x < z \lor x = z \to (\exists y)\mathcal B \tag{4}$$
Since $\vdash x < z \to x < z \lor x = z$ and $\vdash x = z \to x < z \lor x = z$, (4) yields
$$\vdash x < z \to (\exists y)\mathcal B \tag{5}$$
and
$$\vdash x = z \to (\exists y)\mathcal B \tag{6}$$
By (5) and the I.H. we obtain
$$\vdash (\exists w)(\forall x)_{<z}(\exists y)_{<w}\mathcal B \tag{7}$$
By (6) we obtain $\vdash (\exists y)\mathcal B[x \leftarrow z]$; hence
$$\vdash (\exists w)(\exists y)_{<w}\mathcal B[x \leftarrow z] \tag{8}$$

Pause. Why is (8) true? Well, it follows immediately provided we believe
$$\vdash (\exists y)\mathcal B \to (\exists w)(\exists y)_{<w}\mathcal B$$
To establish the above let $(\exists y)\mathcal B[y]$. Add now $\mathcal B[c]$, where $c$ is a new constant. Then $\vdash c < Sc \land \mathcal B[c]$; hence $\vdash (\exists y)(y < Sc \land \mathcal B)$ (by Ax2) and thus $\vdash (\exists w)(\exists y)(y < w \land \mathcal B)$ (by Ax2 again).

Now, arguing by auxiliary constant once more, relying on (7) and (8), we add new constants $a$ and $b$ and the assumptions
$$(\forall x)_{<z}(\exists y)_{<a}\mathcal B \tag{7$'$}$$
and
$$(\exists y)_{<b}\mathcal B[x \leftarrow z] \tag{8$'$}$$
By the Leibniz axiom (Ax4) and tautological implication, (8′) yields
$$\vdash x = z \to (\exists y)_{<b}\mathcal B \tag{9}$$
On the other hand, we obtain from (7′)
$$\vdash x < z \to (\exists y)_{<a}\mathcal B \tag{10}$$
Now set $c = S(a + b)$. Proof by cases from (9) and (10) yields§
$$\vdash x < z \lor x = z \to (\exists y)_{<c}\mathcal B$$
Hence
$$\vdash (\forall x)_{<Sz}(\exists y)_{<c}\mathcal B$$
By Ax2, (3) follows.

† If we set $\mathcal A \equiv (\exists y)\mathcal B$, where $\mathcal B \in \boldsymbol\Delta_0(L_A)$, then $(\exists w)(\forall x)_{<z}(\exists y)_{<w}\mathcal B$ is the $\boldsymbol\Sigma_1(L_A)$ formula we want in order to establish (v).

‡ The reader who has had some axiomatic set theory will notice the remarkable similarity of (1) with the axiom of collection (cf. volume 2, Chapter III). Indeed, (1) interprets collection, if we interpret the basic predicate “$\in$” of set theory as the predicate “$<$” of arithmetic.
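The induction behind (1) can be traced numerically. The sketch below is an informal illustration on actual integers, not a formal derivation: `B` is any decidable relation supplied as a Python predicate, and the names `least_witness`, `collection_bound` and the search cap `limit` are assumptions of ours. The update `w = (w + b) + 1` copies the choice $c = S(a + b)$ made in the induction step.

```python
def least_witness(B, x, limit=10_000):
    """Least y with B(x, y); the cap `limit` only keeps the sketch terminating."""
    for y in range(limit):
        if B(x, y):
            return y
    raise ValueError("no witness found below the search limit")

def collection_bound(B, z):
    """Return w such that every x < z has a witness y < w.

    Basis z = 0: the bound 0 works vacuously.
    Step: given a bound a for all x < z and a bound b for x = z,
    the number S(a + b) bounds witnesses for all x < Sz.
    """
    w = 0
    for x in range(z):
        b = least_witness(B, x) + 1   # some witness for this x lies below b
        w = (w + b) + 1               # c = S(a + b)
    return w


# Illustration: B(x, y) says "y is the square of x".
def B(x, y):
    return y == x * x

w = collection_bound(B, 5)
```

With this `B`, the computed bound is `w == 40`, and every `x` below 5 indeed has its witness `x * x` below `w`. The bound is far from optimal, which is harmless: the argument only needs some bound to exist.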


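§ Via the obvious $\vdash (\exists y)_{<a}\mathcal B \to (\exists y)_{<c}\mathcal B$ and $\vdash (\exists y)_{<b}\mathcal B \to (\exists y)_{<c}\mathcal B$, obtained from II.1.17, tautological implication, and $\exists$-monotonicity (I.4.23).

II.4.7 Lemma. For any $\mathcal A \in \boldsymbol\Delta_0(L_A)$ (respectively, $\mathcal A \in \boldsymbol\Delta_0'(L_A)$) there is a provably equivalent (in pure logic) $\mathcal B \in \boldsymbol\Sigma_1(L_A)$ (respectively, $\mathcal B \in \boldsymbol\Sigma_1'(L_A)$).


Proof. Let $x$ be a variable that is not free in $\mathcal A$. Then


(a) $(\exists x)\mathcal A \in \boldsymbol\Sigma_1(L_A)$ (respectively, $(\exists x)\mathcal A \in \boldsymbol\Sigma_1'(L_A)$), and
(b) $\vdash (\exists x)\mathcal A \leftrightarrow \mathcal A$.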


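II.4.8 Lemma. For any $\mathcal A \in \boldsymbol\Delta_0(L_A)$ there is a $\mathcal B \in \boldsymbol\Sigma_1'(L_A)$ such that
$$\vdash_{PA'} \mathcal A \leftrightarrow \mathcal B$$
The absence of a prime from $\boldsymbol\Delta_0(L_A)$ is intentional. PA′ is as in II.4.6.


Proof. In view of II.4.5, we do induction on the definition of $\boldsymbol\Delta_0^+(L_A)$ (induction variable is $\mathcal A$).


Basis. Atomic formulas of type $t = s$: We first look at the subcase $t = y$. We embark on an informal induction on the formation of $t$.


For the basis we have two cases: $t \equiv x$ and $t \equiv 0$. Both lead to restricted atomic formulas (II.4.3), and we are done by II.4.7. If now $t \equiv ft_1\ldots t_n$, then (one point rule, I.7.2)
$$\vdash ft_1\ldots t_n = y \leftrightarrow (\exists x_1)\ldots(\exists x_n)\bigl(\underbrace{t_1 = x_1}_{\text{I.H.}} \land \ldots \land \underbrace{t_n = x_n}_{\text{I.H.}} \land \underbrace{fx_1\ldots x_n = y}_{\boldsymbol\Delta_0'(L_A)}\bigr)$$
where the $x_i$ are new variables. By the I.H. on terms, the Leibniz rule, and II.4.6–II.4.7 we are done.


We can now conclude the $t = s$ case:
$\vdash t = s \leftrightarrow (\exists y)(t = y \land s = y)$, where $y$ is a new variable.

Basis. Atomic formulas of type $Pt_1\ldots t_n$: Done by
$$\vdash Pt_1\ldots t_n \leftrightarrow (\exists x_1)\ldots(\exists x_n)\bigl(t_1 = x_1 \land \ldots \land t_n = x_n \land \underbrace{Px_1\ldots x_n}_{\boldsymbol\Delta_0'(L_A)}\bigr)$$

Basis. Negated atomic formulas of type $\neg t = s$:
$$\vdash \neg t = s \leftrightarrow (\exists x)(\exists y)\bigl(t = x \land s = y \land \underbrace{\neg x = y}_{\boldsymbol\Delta_0'(L_A)}\bigr)$$

Basis. Negated atomic formulas of type $\neg Pt_1\ldots t_n$:
$$\vdash \neg Pt_1\ldots t_n \leftrightarrow (\exists x_1)\ldots(\exists x_n)\bigl(t_1 = x_1 \land \ldots \land t_n = x_n \land \underbrace{\neg Px_1\ldots x_n}_{\boldsymbol\Delta_0'(L_A)}\bigr)$$

The induction steps are for $\lor$, $\land$, $(\exists x)_{<t}$, $(\forall x)_{<t}$ and follow at once from II.4.6 (i), (ii), (iii), (iv), and (v), and $\vdash (\exists x)_{<t}\mathcal A \leftrightarrow (\exists z)(z = t \land (\exists x)_{<z}\mathcal A)$.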


II.4.9 Corollary. For any $\mathcal A \in \boldsymbol\Sigma_1(L_A)$ there is a provably (in PA′) equivalent $\mathcal B \in \boldsymbol\Sigma_1'(L_A)$.


Proof. II.4.8 and II.4.6(iii).


II.4.10 Lemma. Let PA′ over $L_{N'}$ be a recursive extension of PA. Then for each $\mathcal A \in \boldsymbol\Sigma_1(L_{N'})$ there is a formula $\mathcal B \in \boldsymbol\Sigma_1(L_N)$ such that $\vdash_{PA'} \mathcal A \leftrightarrow \mathcal B$.


Proof. It suffices to prove that if $L_{A'}$ is obtained from $L_A$ by the addition of either a single function or a single predicate symbol, and the defining axiom was added to a recursive extension $T$ (over $L_A$) of PA yielding PA′, then for each $\mathcal A \in \boldsymbol\Sigma_1(L_{A'})$ there is a formula $\mathcal B \in \boldsymbol\Sigma_1(L_A)$ such that $\vdash_{PA'} \mathcal A \leftrightarrow \mathcal B$.

In view of II.4.9 and II.4.6(iii) it suffices to prove this latter claim just for all $\mathcal A \in \boldsymbol\Delta_0'(L_{A'})$. We do induction on the definition of $\boldsymbol\Delta_0'(L_{A'})$ (induction variable: $\mathcal A$).

Basis. Restricted atomic cases and their negations. All cases are trivial (II.4.7), except $f\vec x_n = y$, $\neg f\vec x_n = y$, $P\vec x_n$, and $\neg P\vec x_n$, when $f$, or $P$, is the new symbol.


Say we have the case $\neg f\vec x_n = y$, where
$$f\vec x_n = (\mu z)\mathcal B$$
and $\mathcal B \in \boldsymbol\Delta_0(L_A)$.
Note the absence of a prime from $A$ in $\boldsymbol\Delta_0(L_A)$. “$\vdash$” below is “$\vdash_{PA'}$”.
Thus,
$$\vdash f\vec x_n = y \leftrightarrow \mathcal B[y] \land (\forall z)_{<y}\neg\mathcal B[z]$$
and therefore
$$\vdash \neg f\vec x_n = y \leftrightarrow (\exists w)\bigl(\neg y = w \land \mathcal B[w] \land (\forall z)_{<w}\neg\mathcal B[z]\bigr)$$

The right hand side of “$\leftrightarrow$” above is in $\boldsymbol\Sigma_1(L_A)$.
If on the other hand $P$ is the new symbol, then we have
$$P\vec x_n \leftrightarrow \mathcal B$$
and $\mathcal B \in \boldsymbol\Delta_0(L_A)$, and this rests the case via II.4.7.


The induction steps are (II.4.3) for $\lor$, $\land$, $(\exists x)_{<y}$, $(\forall x)_{<y}$ and follow at once from II.4.6 (i), (ii), (iv), and (v) and the standard technique of eliminating defined symbols (at the atomic formula level).


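II.4.11 Corollary. Let PA′ over $L_{N'}$ be a recursive extension of PA. Then for each $\mathcal A \in \boldsymbol\Sigma_1(L_{N'})$ there is a formula $\mathcal B \in \boldsymbol\Sigma_1'(L_N)$ such that $\vdash_{PA'} \mathcal A \leftrightarrow \mathcal B$.


Proof. II.4.10 guarantees a $\mathcal C \in \boldsymbol\Sigma_1(L_N)$ such that $\vdash_{PA'} \mathcal A \leftrightarrow \mathcal C$. Now II.4.9 provides the $\mathcal B \in \boldsymbol\Sigma_1'(L_N)$ we want (recall that PA′ extends PA).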


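II.4.12 Theorem. Let PA′ over $L_{N'}$ be a recursive extension of PA, and $\mathcal A$ a sentence in $\boldsymbol\Sigma_1(L_{N'})$. Then $\models_{\mathfrak N'} \mathcal A$ implies that $\vdash_{PA'} \mathcal A$.


Or, “any really true sentence of $\boldsymbol\Sigma_1(L_{N'})$ is provable”.


Proof. By II.4.11 there is a sentence $\mathcal B \in \boldsymbol\Sigma_1'(L_N)$ (note the absence of a prime from $L_N$) such that
$$\vdash_{PA'} \mathcal A \leftrightarrow \mathcal B \tag{1}$$
Thus (do you believe this?)
$$\models_{\mathfrak N'} \mathcal A \leftrightarrow \mathcal B$$
Hence
$$\models_{\mathfrak N'} \mathcal B$$
Now, there is a formula $\mathcal C(x) \in \boldsymbol\Delta_0'(L_N)$ such that†
$$\mathcal B \equiv (\exists x)\mathcal C(x)$$
Hence
$$\models_{\mathfrak N'} (\exists x)\mathcal C(x)$$
Therefore, taking reducts,
$$\models_{\mathfrak N} (\exists x)\mathcal C(x) \tag{2}$$

† $\equiv$ means string equality.

By (2), $(\mathcal C(\bar n))^{\mathfrak N} = \mathbf t$ for some $n \in \mathbb N$; hence (p. 171)
$$(\mathcal C(\widetilde n))^{\mathfrak N} = \mathbf t \tag{3}$$
By (3), and an easy induction on the structure of $\mathcal C(\widetilde n)$ (Definition II.4.3), following the steps of the proof of I.9.48 (p. 185),
$$\vdash_{ROB} \mathcal C(\widetilde n)$$
Hence $\vdash_{PA'} (\exists x)\mathcal C(x)$ by Ax2. By (1), $\vdash_{PA'} \mathcal A$.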
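The step from (2) to (3) is where truth turns into a finite check: a true existential sentence has a numerical witness, and the bounded matrix can then be verified for that witness by a terminating computation. The following sketch illustrates this informally; the helper name `sigma1_witness`, the optional cap `max_candidates`, and the sample sentence are our own assumptions.

```python
from itertools import count

def sigma1_witness(matrix, max_candidates=None):
    """Search 0, 1, 2, ... for the least n with matrix(n) true.

    `matrix` plays the role of a bounded-quantifier formula C(x): each call
    must terminate. If the sentence (exists x) C(x) is true the search halts;
    `max_candidates` is only a safety cap for this sketch.
    """
    for n in count():
        if max_candidates is not None and n >= max_candidates:
            return None
        if matrix(n):
            return n

# Sample sentence: (exists x)(exists a < x)(exists b < x)(1 < a and 1 < b and a * b = 91).
def matrix(x):
    return any(a * b == 91 for a in range(2, x) for b in range(2, x))

witness = sigma1_witness(matrix)
```

Here the search stops at `witness == 14`, since 7 and 13 are the only factors above 1 and both must lie below the bound. Nothing comparable is available for a false sentence, for which the uncapped search would simply never halt.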


II.4.13 Example. Equipped with II.4.12, one can now readily answer the “why”s that were embedded in Example II.2.7.


II.4.14 Remark. What II.4.12 above (and, earlier on, II.3.8) do for us, practically, is to eliminate the need for formal proofs of “true” $\boldsymbol\Sigma_1$-sentences $\mathcal A$, proofs that would be normally carried out in tedious detail within some recursive extension of PA. This is achieved by a two-pass process:


(1) We somehow convince ourselves that $\mathcal A$ is “true”, i.e., true in $\mathfrak N$ or in some expansion thereof that accommodates whatever defined symbols are used inside the sentence.

(2) We then invoke (Meta)theorem II.4.12, which guarantees $\vdash \mathcal A$, without our having to write down a single line of a formal proof!


The reader may object: Obviously all the work was shifted to item (1). But how does one “prove” informally the “truth” of a sentence? Does it not necessitate as much work (in an informal deduction†) as a formal proof does? For example, that $\mathit{lcm}\{2, 3, 4, 5\} = 60$ is definitely obvious, but just proclaiming so does not prove this fact. How do we know that it is true?

The explanation is not mathematical, but it rather hinges on the sociology of 
informal proofs. An informal proof, viewed as a social activity, ends as soon as 
a reasonably convincing case has been made. Informal proofs are often sketchy 
(hence shorter than formal proofs), and the participants (prover and reader) 
usually agree that a certain level of detail can be left untold, and are also 
prepared to accept “the obvious” — the latter being informed by a vast database 
of “real” mathematical knowledge that goes all the way back to one’s primary 
school years. 


† Of course, one would never think of establishing truth in mathematical practice purely by semantic means, for this would involve messy infinitary arguments, while deductions, even informal ones, have the advantage of being finitary.

‡ We are guilty of often leaving details for the reader to work out in our formal proofs. This dilutes the formal proof by an informality that we exercise for pedagogical reasons.

Surely sometime in the past we have learnt how to compute the $\mathit{lcm}$, or the $\mathit{gcd}$, of a set of numbers, or that $13 + 7 = 20$ is “true”. We retrieve all this from the “database”. Of course, a formal proof in ROB of, say, $\widetilde{13} + \widetilde 7 = \widetilde{20}$ is another matter, and is certainly not totally painless, as the reader who has read the lemma on the definability of $x + y = z$ (I.9.41, p. 181) will agree.†

Thus, deductions in “real” mathematics need only be as long (i.e., as de- 
tailed) as necessary for their acceptability. Indeed, one often finds that the level 
of detail included in various informal proofs of the same result is inversely pro- 
portional to the mathematical sophistication of the targeted audiences in each 
presentation. By contrast, formal proofs are, by definition (cf. I.3.17, p. 37),
audience-independent. 


Back to practice: Having performed step (1) above, we can now do one of 
two things: We can decide not to cheat by using step (2), but write down instead 
a formal argument that translates the informal one of step (1), treating the latter 
merely as a set of “organizational notes”. 

Or, we can take the easy way out and invoke step (2). This avenue establishes 
the formal result entirely metamathematically. We have shown that a formal 
proof exists, without constructing the proof. 

The approach is analogous to that of employing Church’s thesis to informally 
“prove” results in recursion theory. This thesis says: “If we have shown by an 
informal argument that a partial function f is computable in the intuitive sense — 
for example, we have written informal instructions for its computation — then 
this f is partial recursive, that is, a formal algorithm for its computation exists 
(e.g., an algorithm formalized as a Turing machine, or as a Kleene schemata 
description)”. We do not need to exhibit this formal algorithm. 

Compare with “If we have shown by an informal argument that a sentence $\mathcal A$ among the types allowed in II.4.12 and II.3.8 is true in the standard structure, then there exists a proof for this $\mathcal A$ in (some conservative extension of) PA”.

† It is normal (sloppy) practice to invoke “general” results where it would have been more appropriate, in order to avoid circularity, not to do so. For example, we may claim that (for any fixed $a$ and $b$ in $\mathbb N$) the “true” sentence $\widetilde a + \widetilde b = \widetilde{a + b}$ is provable, by invoking II.4.12 or II.3.8. Correct practice would have been to say that the provability of the sentence is due to I.9.41, since the latter was used towards the establishment of II.4.12 and II.3.8. But this puts a heavier burden on memory.

‡ For example, it is held that part of the reason that Gödel never published a complete proof of his second incompleteness theorem was that the result was readily believed by his targeted audience.

§ This is entirely analogous to what a computer programmer might do. He would first develop pseudocode (an informal program) towards the solution of a problem. He would then translate the pseudocode to a formal computer program written in the appropriate programming language.

However, even though Church’s thesis and II.4.12 and II.3.8 are applied in a similar manner, there is a major difference between them: The former is only a belief based on empirical evidence,† while the latter two are metatheorems.


Unlike “true” existential sentences, “true” universal, or $\Pi_1$, sentences (of arithmetic) are not necessarily provable, as Gödel’s first incompleteness theorem tells us. The $\Pi_1$-formulas are those of the form $\neg\mathcal A$ where $\mathcal A$ is $\boldsymbol\Sigma_1$.

For example, if $T$ is the formal Kleene predicate, then there are infinitely many “true” $\Pi_1$-sentences of the form $\neg(\exists y)T(\widetilde a, \widetilde a, y)$ that are not provable (cf. (3) in I.9.37).


II.5. Arithmetization


We now resume and conclude the discussion on the arithmetization of formal arithmetic(s) that we have started in Section I.9 (p. 168). The arithmetization is really a package that includes the Gödel numbers of terms and formulas on one hand, and, on the other hand, a finite suite of formal predicates and functions introduced to test for properties of Gödel numbers (e.g., testing whether $x$ is a number for a formula, term, variable, etc.).

This package will be totally contained inside an appropriate recursive extension of PA over the appropriate $L_{N'}$ that contains all the tools (such as “$\langle\ldots\rangle$”, etc.) needed to form Gödel numbers as on p. 168 and all the additional test predicates and functions (and their defining axioms).

Gödel numbers, “$\ulcorner\ldots\urcorner$”, will be certain closed terms over $L_{N'}$ rather than “real” natural numbers from $\mathbb N$. We will rely on the numbering given on p. 168; however, we shall now employ the formal minimum coding $\langle\ldots\rangle$ given by (FC) on p. 239 and also make the following trivial amendment to notation: Every occurrence of a specific natural number $n$ inside $\langle\ldots\rangle$ brackets is now changed to the term denoted by $\widetilde n$ (numeral).


We next introduce the testing tools, that is, a finite sequence of atomic formulas and terms (introducing appropriate predicate and function symbols in a manner that respects the rules for recursive extensions) that enable the theory to “reason about” formulas (using Gödel numbers as aliases of such formulas).


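† In some contexts (such as when partial function oracles are allowed) there is evidence to the contrary: If $L(x, \alpha, y)$ is the relation that says “program $x$ with input the oracle $\alpha$ has a computation of length less than $y$”, then this is intuitively computable: Just let program $x$ crank with input $\alpha$ and keep track of the number of steps. If the program halts in fewer than $y$ steps, then stop everything and return “yes”; otherwise, if $x$ has already performed the $y$th step, stop everything and return “no”. Now, if $\alpha$ and $\beta$ are partial functions, $\alpha \subseteq \beta$ does not guarantee that $L(x, \alpha, y)$ and $L(x, \beta, y)$ yield the same answer, that is, $L$ is non-monotone, or inconsistent. In most foundations of computability inconsistent relations such as $L$ are not allowed, i.e., $L$ is not formally computable in such theories; hence Church’s thesis fails with respect to such theories. This particular “negative evidence” is eliminated if we use a different foundation of computability that was introduced in Tourlakis (1986, 1996, 2001a). Now $L$ is formally computable. See also Kalmár’s (1957) objections.

In each case we precede the definition with a comment stating the intended meaning.

$\mathit{Var}(x,i)$ holds if “$x = \ulcorner v_i\urcorner$”.
$$\begin{aligned}
\mathit{Var}(x,i) \leftrightarrow{} & \mathit{Seq}(x) \land \mathit{lh}(x) = i + 4 \land (x)_0 = S0\\
& \land (x)_{S0} = \ulcorner(\urcorner \land (x)_{i+SSS0} = \ulcorner)\urcorner\\
& \land (\forall z)_{<\mathit{lh}(x)}\bigl(S0 < z \land z < i + 3 \to (x)_z = \ulcorner{}'\urcorner\bigr)
\end{aligned} \tag{Var}$$

We prefer to write, say, “$\ldots = \ulcorner(\urcorner$” rather than “$\ldots = \langle 0, 9\rangle$”.


$\mathit{Func}(x,i,j)$ holds if “$x = \ulcorner f_i^j\urcorner$”.
$$\begin{aligned}
\mathit{Func}(x,i,j) \leftrightarrow{} & \mathit{Seq}(x) \land \mathit{lh}(x) = i + j + 6 \land (x)_0 = 2\\
& \land (x)_{S0} = \ulcorner(\urcorner \land (x)_{i+j+5} = \ulcorner)\urcorner\\
& \land (x)_{i+3} = \ulcorner\#\urcorner\\
& \land (\forall z)_{<\mathit{lh}(x)}\bigl(S0 < z \land z < i + j + 5 \land \neg z = i + 3 \to (x)_z = \ulcorner{}'\urcorner\bigr)
\end{aligned} \tag{Func}$$

$\mathit{Pred}(x,i,j)$ means “$x = \ulcorner P_i^j\urcorner$”.
$$\begin{aligned}
\mathit{Pred}(x,i,j) \leftrightarrow{} & \mathit{Seq}(x) \land \mathit{lh}(x) = i + j + 6 \land (x)_0 = 3\\
& \land (x)_{S0} = \ulcorner(\urcorner \land (x)_{i+j+5} = \ulcorner)\urcorner\\
& \land (x)_{i+3} = \ulcorner\#\urcorner\\
& \land (\forall z)_{<\mathit{lh}(x)}\bigl(S0 < z \land z < i + j + 5 \land \neg z = i + 3 \to (x)_z = \ulcorner{}'\urcorner\bigr)
\end{aligned} \tag{Pred}$$

Before we proceed with terms and other constructs we introduce the lowercase versions of the above predicates:

$\mathit{var}(x) \leftrightarrow (\exists i)_{<x}\mathit{Var}(x,i)$ holds if “$x = \ulcorner v_i\urcorner$, for some $i$”
$\mathit{func}(x,n) \leftrightarrow (\exists i)_{<x}\mathit{Func}(x,i,n)$ holds if “$x = \ulcorner f_i^n\urcorner$, for some $i$”
$\mathit{pred}(x,n) \leftrightarrow (\exists i)_{<x}\mathit{Pred}(x,i,n)$ holds if “$x = \ulcorner P_i^n\urcorner$, for some $i$”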


$\mathit{Term}$ is a new function symbol introduced by course-of-values recursion (Theorem II.3.3, p. 251) below. $\mathit{Term}(\ulcorner t\urcorner) = 0$ means “$t$ is a term”.


On the other hand, the statement “$\mathit{Term}(x) = 0$ implies that ‘$x = \ulcorner t\urcorner$ for some term $t$’ ” is not a fair description of what “$\mathit{Term}$” does. That is, after establishing that $\mathit{Term}(x) = 0$, we can only infer that $x$ “behaves like” the Gödel number of some term, not that it is the Gödel number of some term. See Lemma II.6.24 (p. 297) later on, and the comment following it. If “implies” is not apt, then we cannot say “means” either, for the latter is argot for equivalence. We can say “holds if”, though.


Similar caution must be exercised when interpreting the remaining functions and predicates introduced in this section.


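$$\mathit{Term}(x) = \begin{cases}
0 & \text{if } \mathit{var}(x) \lor x = \langle 0, S0\rangle\\
  & \quad\lor\ \mathit{Seq}(x) \land (\exists y)_{<x}\bigl(\mathit{lh}(x) = Sy \land \mathit{func}((x)_0, y)\\
  & \qquad\land\ (\forall z)_{<y}\,\mathit{Term}((x)_{Sz}) = 0\bigr)\\
S0 & \text{otherwise}
\end{cases} \tag{Trm}$$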
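The recursion (Trm) can be exercised informally on nested Python tuples standing in for coded sequences. This is a sketch under loud assumptions: the pair `(0, 9)` for the left bracket and `(0, 1)` for the symbol 0 are taken from the text, while the codes chosen below for the right bracket, the stroke and `#` are hypothetical placeholders, as are all function names. The layout of variable and function-symbol codes follows our reading of (Var) and (Func).

```python
ZERO = (0, 1)          # code of the symbol 0, as in "x = <0, S0>"
LP = (0, 9)            # code of "(" as quoted in the text
RP = (0, 10)           # hypothetical code for ")"
STROKE = (0, 11)       # hypothetical code for the stroke symbol
HASH = (0, 12)         # hypothetical code for "#"

def var_code(i):
    """Code of v_i: tag 1, "(", i + 1 strokes, ")" (length i + 4)."""
    return (1, LP) + (STROKE,) * (i + 1) + (RP,)

def func_code(i, j):
    """Code of f_i^j: tag 2, "(", i + 1 strokes, "#", j + 1 strokes, ")"."""
    return (2, LP) + (STROKE,) * (i + 1) + (HASH,) + (STROKE,) * (j + 1) + (RP,)

def is_var(x):
    return (isinstance(x, tuple) and len(x) >= 4 and x[0] == 1
            and x[1] == LP and x[-1] == RP
            and all(s == STROKE for s in x[2:-1]))

def func_arity(x):
    """Return j if x has the shape of the code of some f_i^j, else None."""
    if not (isinstance(x, tuple) and len(x) >= 6 and x[0] == 2
            and x[1] == LP and x[-1] == RP):
        return None
    body = x[2:-1]
    if body.count(HASH) != 1:
        return None
    k = body.index(HASH)
    left, right = body[:k], body[k + 1:]
    if not left or not right or any(s != STROKE for s in left + right):
        return None
    return len(right) - 1

def term(x):
    """0 if x behaves like the code of a term, 1 otherwise (mirrors (Trm))."""
    if is_var(x) or x == ZERO:
        return 0
    if isinstance(x, tuple) and len(x) >= 1:
        y = len(x) - 1                       # lh(x) = Sy
        if func_arity(x[0]) == y and all(term(a) == 0 for a in x[1:]):
            return 0
    return 1
```

For example, `term((func_code(0, 2), var_code(3), ZERO))` returns 0, while dropping one argument, as in `term((func_code(0, 2), var_code(3)))`, returns 1 because the sequence length no longer matches the arity.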
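$\mathit{AF}(\ulcorner\mathcal A\urcorner)$ says “$\mathcal A$ is atomic”.
$$\begin{aligned}
\mathit{AF}(x) \leftrightarrow{} & (\exists y)_{<x}(\exists z)_{<x}\bigl(\mathit{Term}(y) = 0 \land \mathit{Term}(z) = 0 \land x = \langle\ulcorner{=}\urcorner, y, z\rangle\bigr)\\
& \lor\ \mathit{Seq}(x) \land (\exists y)_{<x}\bigl(\mathit{lh}(x) = Sy \land \mathit{pred}((x)_0, y) \land (\forall z)_{<y}\,\mathit{Term}((x)_{Sz}) = 0\bigr)
\end{aligned} \tag{Af}$$
$\mathit{WFF}$ is a new function symbol introduced by course-of-values recursion below. $\mathit{WFF}(\ulcorner\mathcal A\urcorner) = 0$ means “$\mathcal A \in \mathbf{Wff}$”:
$$\mathit{WFF}(x) = \begin{cases}
0 & \text{if } \mathit{AF}(x) \lor \mathit{Seq}(x) \land \bigl(\mathit{lh}(x) = 3 \land [(x)_0 = \ulcorner\lor\urcorner\\
  & \qquad\land\ \mathit{WFF}((x)_{S0}) = 0 \land \mathit{WFF}((x)_{SS0}) = 0\\
  & \qquad\lor\ (x)_0 = \ulcorner\exists\urcorner \land \mathit{var}((x)_{S0}) \land \mathit{WFF}((x)_{SS0}) = 0]\\
  & \quad\lor\ \mathit{lh}(x) = 2 \land (x)_0 = \ulcorner\neg\urcorner \land \mathit{WFF}((x)_{S0}) = 0\bigr)\\
S0 & \text{otherwise}
\end{cases} \tag{Wf}$$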
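We now introduce a new function symbol, $\mathit{Free}$, such that $\mathit{Free}(i, x) = 0$ intuitively says “if $x$ is the Gödel number of a term or formula $E$, then $v_i$ occurs free in $E$” (see I.1.10, p. 18).
$$\mathit{Free}(i,x) = \begin{cases}
0 & \text{if } \mathit{Var}(x,i)\ \lor\\
  & \quad\bigl(\mathit{AF}(x) \lor (\mathit{Term}(x) = 0 \land (\exists y)_{<x}\mathit{func}((x)_0, y))\bigr)\ \land\\
  & \qquad(\exists y)_{<\mathit{lh}(x)}\bigl(0 < y \land \mathit{Free}(i, (x)_y) = 0\bigr)\ \lor\\
  & \quad\mathit{WFF}(x) = 0 \land (x)_0 = \ulcorner\neg\urcorner \land \mathit{Free}(i, (x)_{S0}) = 0\ \lor\\
  & \quad\mathit{WFF}(x) = 0 \land (x)_0 = \ulcorner\lor\urcorner\ \land\\
  & \qquad\bigl(\mathit{Free}(i, (x)_{S0}) = 0 \lor \mathit{Free}(i, (x)_{SS0}) = 0\bigr)\ \lor\\
  & \quad\mathit{WFF}(x) = 0 \land (x)_0 = \ulcorner\exists\urcorner\ \land\\
  & \qquad\neg\mathit{Var}((x)_{S0}, i) \land \mathit{Free}(i, (x)_{SS0}) = 0\\
S0 & \text{otherwise}
\end{cases} \tag{Fr}$$
We next introduce Gödel’s substitution function, a precursor of Kleene’s S-m-n functions. We introduce a new function symbol by course-of-values recursion. $\mathit{Sub}(x, y, z)$ will have the following intended effect:
$$\mathit{Sub}(\ulcorner t[v_i]\urcorner, i, \ulcorner s\urcorner) = \ulcorner t[s]\urcorner \quad\text{and}\quad \mathit{Sub}(\ulcorner\mathcal A[v_i]\urcorner, i, \ulcorner s\urcorner) = \ulcorner\mathcal A[s]\urcorner$$
However, in the latter case, the result will be as described iff $s$ is substitutable in $v_i$. Otherwise the result 0 will be returned (a good choice for “not applicable”, since no Gödel number equals 0).†
This will make the definition a bit more complicated than usual. Our definition of $\mathit{Sub}$ below tracks I.3.11, p. 32.
$$\mathit{Sub}(x,i,z) = \begin{cases}
z & \text{if } \mathit{Var}(x,i)\\
x & \text{if } x = \langle 0, S0\rangle\\
x & \text{if } \mathit{var}(x) \land \neg\mathit{Var}(x,i)\\
(\mu w)\bigl(\mathit{Seq}(w) \land \mathit{lh}(w) = \mathit{lh}(x) \land (w)_0 = (x)_0\ \land & \\
\quad(\forall y)_{<\mathit{lh}(x)}(y > 0 \to (w)_y = \mathit{Sub}((x)_y, i, z))\bigr) & \text{if } (\mathit{Term}(x) = 0 \land (\exists y)_{<x}\mathit{func}((x)_0, y)) \lor \mathit{AF}(x)\\
\langle (x)_0, \mathit{Sub}((x)_{S0}, i, z)\rangle & \text{if } \mathit{WFF}(x) = 0 \land (x)_0 = \ulcorner\neg\urcorner \land \mathit{Sub}((x)_{S0}, i, z) > 0\\
\langle (x)_0, \mathit{Sub}((x)_{S0}, i, z), \mathit{Sub}((x)_{SS0}, i, z)\rangle & \text{if } \mathit{WFF}(x) = 0 \land (x)_0 = \ulcorner\lor\urcorner\ \land\\
 & \quad\mathit{Sub}((x)_{S0}, i, z) > 0 \land \mathit{Sub}((x)_{SS0}, i, z) > 0\\
x & \text{if } \mathit{WFF}(x) = 0 \land (x)_0 = \ulcorner\exists\urcorner \land \mathit{Var}((x)_{S0}, i)\\
\langle (x)_0, (x)_{S0}, \mathit{Sub}((x)_{SS0}, i, z)\rangle & \text{if } \mathit{WFF}(x) = 0 \land (x)_0 = \ulcorner\exists\urcorner \land \neg\mathit{Var}((x)_{S0}, i)\\
 & \quad\land\ \mathit{Sub}((x)_{SS0}, i, z) > 0\ \land\\
 & \quad(\exists j)_{<(x)_{S0}}\bigl(\mathit{Var}((x)_{S0}, j) \land \mathit{Free}(j, z) > 0\bigr)\\
0 & \text{otherwise}
\end{cases} \tag{Sub}$$
A two variable version, the family of the “lowercase $\mathit{sub}_i$”, for $i \in \mathbb N$, is also useful:
$$\mathit{sub}_i(x, z) = \mathit{Sub}(x, \widetilde i, z) \tag{sub$_i$}$$
For any terms $t[v_m]$, $s$ and formula $\mathcal A[v_m]$, we will (eventually) verify that $\vdash \mathit{sub}_m(\ulcorner t[v_m]\urcorner, \ulcorner s\urcorner) = \ulcorner t[s]\urcorner$ and $\vdash \mathit{sub}_m(\ulcorner\mathcal A[v_m]\urcorner, \ulcorner s\urcorner) = \ulcorner\mathcal A[s]\urcorner$, assuming that $s$ is substitutable for $v_m$ in $\mathcal A$.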


+ In the literature — e.g., Shoenfield (1967), Smoryriski (1978, 1985) — one tends to define Sub with 
no regard for substitutability, i.e., pretending that all is O.K. One then tests for substitutability 
via other predicates or functions that are subsequently introduced. 
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In the next section we introduce (informally) yet another special case of sub 
that has special significance towards proving the second incompleteness theo- 
rem. For now, we continue with the introduction of predicates that “recognize” 
Godel numbers of logical axioms. 

The introductory remarks to Section II.1 announced that we will be making 
an amendment to what the set A of logical axioms is, in order to make the 
arithmetization more manageable. 

Indeed, while keeping the schemata Ax2—Ax4 the same, we change group 
Ax1 — the “propositional axioms” — to consist of the four schemata below: 


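(1) $\mathcal{A}\vee\mathcal{A}\to\mathcal{A}$
(2) $\mathcal{A}\to\mathcal{A}\vee\mathcal{B}$
(3) $\mathcal{A}\vee\mathcal{B}\to\mathcal{B}\vee\mathcal{A}$
(4) $(\mathcal{A}\to\mathcal{B})\to(\mathcal{C}\vee\mathcal{A}\to\mathcal{C}\vee\mathcal{B})$.

Of course, the new axiom group Ax1 is equivalent to the old one on p. 34,
as Exercises I.26–I.41 (pp. 193–195) in Chapter I show.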
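As a quick sanity check (not a substitute for the cited exercises), one can confirm by truth tables that each of the four schemata is a tautology. The sketch below assumes the reading (1) A ∨ A → A, (2) A → A ∨ B, (3) A ∨ B → B ∨ A, (4) (A → B) → (C ∨ A → C ∨ B); the names `SCHEMATA` and `is_tautology` are illustrative only.

```python
from itertools import product

def implies(p, q):
    """Material implication on Python booleans."""
    return (not p) or q

# Each schema read as a Boolean function of three propositional values.
SCHEMATA = {
    1: lambda a, b, c: implies(a or a, a),
    2: lambda a, b, c: implies(a, a or b),
    3: lambda a, b, c: implies(a or b, b or a),
    4: lambda a, b, c: implies(implies(a, b), implies(c or a, c or b)),
}

def is_tautology(f):
    """True when f evaluates to True under all eight truth assignments."""
    return all(f(a, b, c) for a, b, c in product([False, True], repeat=3))

results = {k: is_tautology(f) for k, f in SCHEMATA.items()}
```

This only shows soundness of the amended group with respect to truth tables; equivalence with the old group is what the exercises establish.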


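Thus we introduce predicates $\mathbf{Prop}_i$ ($i=1,2,3,4$) such that $\mathbf{Prop}_i(x)$ is
intended to hold if $x$ is a Gödel number of a formula belonging to the schema ($i$),
$i=1,2,3,4$. Here is one case (case (4)); cases (1)–(3) are less work.
\[
\mathbf{Prop}_4(x)\leftrightarrow(\exists y)_{<x}(\exists z)_{<x}(\exists w)_{<x}\bigl(\mathbf{WFF}(y)=0\wedge\mathbf{WFF}(z)=0\wedge\mathbf{WFF}(w)=0\wedge x=\langle\ulcorner\to\urcorner,\langle\ulcorner\to\urcorner,y,z\rangle,\langle\ulcorner\to\urcorner,\langle\ulcorner\vee\urcorner,w,y\rangle,\langle\ulcorner\vee\urcorner,w,z\rangle\rangle\rangle\bigr)
\]

To simplify notation we have used the abbreviation “$\langle\ulcorner\to\urcorner,a,b\rangle$” for the
fully explicit “$\langle\ulcorner\vee\urcorner,\langle\ulcorner\neg\urcorner,a\rangle,b\rangle$”. This will recur in our descriptions of the
remaining axiom schemata.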


We finally introduce
\[
\mathbf{Prop}(x)\leftrightarrow\mathbf{Prop}_1(x)\vee\mathbf{Prop}_2(x)\vee\mathbf{Prop}_3(x)\vee\mathbf{Prop}_4(x) \tag{Prop}
\]
which is intended to hold if $x$ is a Gödel number of an axiom of type Ax1 (as
the Ax1 group was amended above).


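For Ax2 we introduce $\mathbf{SubAx}$.
$\mathbf{SubAx}(x)$ holds if $x=\ulcorner\mathcal{A}[z\leftarrow t]\to(\exists z)\mathcal{A}\urcorner$ for some formula $\mathcal{A}$, variable
$z$, and term $t$ that is substitutable for $z$ in $\mathcal{A}$. Thus,
\[
\mathbf{SubAx}(x)\leftrightarrow(\exists y)_{<x}(\exists z)_{<x}(\exists w)_{<x}(\exists i)_{<x}\bigl(\mathbf{WFF}(y)=0\wedge\mathbf{Term}(w)=0\wedge\mathbf{Var}(z,i)\wedge\mathbf{Sub}(y,i,w)>0\wedge x=\langle\ulcorner\to\urcorner,\mathbf{Sub}(y,i,w),\langle\ulcorner\exists\urcorner,z,y\rangle\rangle\bigr) \tag{SubAx}
\]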
Note that $\mathbf{Sub}(y,i,w)>0$ “says” that the term with Gödel number $w$ is
substitutable for $v_i$ in the formula with Gödel number $y$. We will verify this in the
next section; however, the reader can already see intuitively that this is so from
the definition of $\mathbf{Sub}$.


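For the axioms Ax3 we introduce $\mathbf{IdAx}$:
$\mathbf{IdAx}(x)$ holds if $x=\ulcorner y=y\urcorner$ for some variable $y$. Thus,
\[
\mathbf{IdAx}(x)\leftrightarrow\mathbf{Seq}(x)\wedge\mathbf{lh}(x)=SSS0\wedge(x)_0=\ulcorner=\urcorner\wedge(x)_{S0}=(x)_{SS0}\wedge\mathbf{var}((x)_{S0}) \tag{IdAx}
\]
Finally, for Ax4 we introduce $\mathbf{EqAx}$. $\mathbf{EqAx}(x)$ holds if $x=\ulcorner t=s\to(\mathcal{A}[t]\leftrightarrow\mathcal{A}[s])\urcorner$
for some formula $\mathcal{A}$ and substitutable terms $t$ and $s$. Thus,
\[
\mathbf{EqAx}(x)\leftrightarrow(\exists y)_{<x}(\exists z)_{<x}(\exists w)_{<x}(\exists v)_{<x}(\exists i)_{<x}\bigl(\mathbf{WFF}(y)=0\wedge\mathbf{Term}(w)=0\wedge\mathbf{Term}(v)=0\wedge\mathbf{Var}(z,i)\wedge\mathbf{Sub}(y,i,w)>0\wedge\mathbf{Sub}(y,i,v)>0\wedge x=\langle\ulcorner\to\urcorner,\langle\ulcorner=\urcorner,w,v\rangle,\langle\ulcorner\leftrightarrow\urcorner,\mathbf{Sub}(y,i,w),\mathbf{Sub}(y,i,v)\rangle\rangle\bigr) \tag{EqAx}
\]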


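Thus, we introduce $\mathbf{LA}$ by
\[
\mathbf{LA}(x)\leftrightarrow\mathbf{Prop}(x)\vee\mathbf{SubAx}(x)\vee\mathbf{IdAx}(x)\vee\mathbf{EqAx}(x) \tag{LA}
\]
$\mathbf{LA}(\ulcorner\mathcal{A}\urcorner)$ means “$\mathcal{A}\in\Lambda$”.
We have two rules of inference. Thus, we introduce $\mathbf{MP}$ by
\[
\mathbf{MP}(x,y,z)\leftrightarrow\mathbf{WFF}(x)=0\wedge\mathbf{WFF}(z)=0\wedge y=\langle\ulcorner\to\urcorner,x,z\rangle \tag{MP}
\]
We also introduce $\exists$-introduction, $\mathbf{EI}$, by
\[
\mathbf{EI}(x,y)\leftrightarrow(\exists u)_{<x}(\exists v)_{<x}(\exists z)_{<x}(\exists i)_{<x}\bigl(x=\langle\ulcorner\to\urcorner,u,v\rangle\wedge\mathbf{WFF}(u)=0\wedge\mathbf{WFF}(v)=0\wedge\mathbf{Var}(z,i)\wedge\mathbf{Free}(i,v)>0\wedge y=\langle\ulcorner\to\urcorner,\langle\ulcorner\exists\urcorner,z,u\rangle,v\rangle\bigr) \tag{EI}
\]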
We will use one more rule of inference in the definition of $\mathbf{Proof}$ below.
This rule, substitution of terms, is, of course, a derived rule (I.4.12, p. 44),
but we acknowledge it explicitly for our future convenience (Lemma II.6.27,
via II.6.18 and II.6.20). Thus, we introduce $\mathbf{SR}$. $\mathbf{SR}(x,y)$ holds if $x=\ulcorner\mathcal{A}\urcorner$
and $y=\ulcorner\mathcal{A}[v_i\leftarrow t]\urcorner$ for some term $t$ substitutable for $v_i$:
\[
\mathbf{SR}(x,y)\leftrightarrow\mathbf{WFF}(x)=0\wedge\mathbf{WFF}(y)=0\wedge(\exists i)_{<x}(\exists z)_{<y}\bigl(\mathbf{Term}(z)=0\wedge y=\mathbf{Sub}(x,i,z)\bigr) \tag{SR}
\]
II.5.1 Remark. In introducing $\mathbf{MP}$, $\mathbf{EI}$, and $\mathbf{SR}$ above we have been consistent
with our definition of inference rules (I.3.15, p. 36) in that the rules act on
formulas.

It is technically feasible to have the rules act on any strings over our alphabet
instead, e.g., view modus ponens as acting on arbitrary strings $\mathcal{A}$ and $(\mathcal{A}\to\mathcal{B})$
to produce the string $\mathcal{B}$. Indeed the literature, by and large, takes this point of
view for the arithmetized versions above, defining, for example, $\mathbf{MP}'$ just as
\[
\mathbf{MP}'(x,y,z)\leftrightarrow y=\langle\ulcorner\to\urcorner,x,z\rangle \tag{$\mathbf{MP}'$}
\]
Still, informally speaking, a proof built upon the primed versions of the rules of
inference is composed entirely of formulas† (as it should be), for these rules
produce formulas if their inputs are all formulas, and, of course, a proof starts
with formulas (the axioms).


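We are ready to test for proofs. Fix a set of nonlogical axioms $\Gamma$.

Suppose that we already have a formula $\boldsymbol{\Gamma}$ such that $\boldsymbol{\Gamma}(\ulcorner\mathcal{A}\urcorner)$ means “$\mathcal{A}\in\Gamma$”.

Then, we may introduce the informal abbreviation $\mathbf{Proof}$ such that
$\mathbf{Proof}(x)$ holds if $x=\langle\ulcorner\mathcal{A}_1\urcorner,\ulcorner\mathcal{A}_2\urcorner,\ldots\rangle$ where $\mathcal{A}_1,\mathcal{A}_2,\ldots$ is some $\Gamma$-proof.

$\mathbf{Proof}(x)$ stands for
\[
\mathbf{Seq}(x)\wedge\mathbf{lh}(x)>0\wedge(\forall i)_{<\mathbf{lh}(x)}\bigl(\mathbf{LA}((x)_i)\vee\boldsymbol{\Gamma}((x)_i)\vee(\exists j)_{<i}(\exists k)_{<i}\mathbf{MP}((x)_j,(x)_k,(x)_i)\vee(\exists j)_{<i}\mathbf{EI}((x)_j,(x)_i)\vee(\exists j)_{<i}\mathbf{SR}((x)_j,(x)_i)\bigr) \tag{Proof}
\]
Note that this introduction of $\mathbf{Proof}$ is modulo $\Gamma$, but most of the time we will
not indicate that dependence explicitly (e.g., as $\mathbf{Proof}_\Gamma$), since whatever we
have in mind will be usually clear from the context.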
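The shape of the test for proofs (a nonempty sequence in which every entry is an axiom or follows from earlier entries by a rule) can be sketched on syntax trees instead of Gödel numbers. This is a minimal sketch under loud assumptions: formulas are hypothetical Python values with implications written as `("->", a, b)`, the two axiom tests are supplied by the caller, and only modus ponens is implemented (the ∃-introduction and substitution rules are omitted for brevity).

```python
def mp(x, y, z):
    """Mirror of MP(x, y, z): y is the implication x -> z."""
    return y == ("->", x, z)

def is_proof(seq, is_logical_axiom, is_nonlogical_axiom):
    """Check that seq is nonempty and every entry is justified.

    An entry is justified if it is a logical axiom, a nonlogical axiom,
    or the conclusion of modus ponens applied to two earlier entries.
    """
    if len(seq) == 0:
        return False
    for i, f in enumerate(seq):
        earlier = seq[:i]
        justified = (
            is_logical_axiom(f)
            or is_nonlogical_axiom(f)
            or any(mp(a, b, f) for a in earlier for b in earlier)
        )
        if not justified:
            return False
    return True
```

For example, with the toy axiom set `{"p", ("->", "p", "q")}` the sequence `["p", ("->", "p", "q"), "q"]` passes, while `["q"]` alone does not. The quantifiers bounded by `i` in the displayed definition correspond to the slice `seq[:i]`.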

Gödel’s theorem is about theories with a “recognizable” set of axioms, i.e.,
recursively axiomatized theories. However, it is also applicable if we take the
somewhat more general assumption that $\Gamma$ is a semi-recursive set, which formally
means that we want the formula $\boldsymbol{\Gamma}$ to be $\Sigma_1$ over the language where
we effect our coding. Informally, this means that even if we cannot recognize
the axioms, nevertheless we can form an effective (algorithmic) listing of all of
them; Gödel’s results still hold for such theories.‡


† As an easy informal induction on the length of proofs shows.

‡ It is easy to see that the set of theorems of a theory with a semi-recursive set of axioms is
also semi-recursive, by closure of semi-recursive relations under $(\exists y)$. This immediately opens
any such theory for arithmetic to the incompletableness phenomenon, since the set of all true
sentences is not semi-recursive.
The reader will note that PA is much “simpler” than semi-recursive. Indeed,
it is not hard to show that if $\Gamma$ = PA, then the formula $\boldsymbol{\Gamma}$ is primitive recursive.†
In this case, $\boldsymbol{\Gamma}(x)$ will be equivalent to a lengthy disjunction of cases. For
example, the case corresponding to axiom +1, $z+0=z$, is given by the ($\Delta_0$)
formula
\[
(\exists z)_{<x}\bigl(\mathbf{var}(z)\wedge x=\langle\ulcorner=\urcorner,\langle\ulcorner+\urcorner,z,\ulcorner 0\urcorner\rangle,z\rangle\bigr)
\]
We leave the details to the reader (Exercise II.15).
In this section we will find a sentence in $L_N$ that says “I am not a theorem”.

We will also show that certain derivability conditions (the Ableitbarkeitsforderungen‡
of Hilbert and Bernays (1968)) hold in some appropriate extension,
$C$, of PA. These conditions talk about deducibility (derivability) in
appropriate consistent extensions $\Gamma$ of PA. At the end of all this, to prove the
second incompleteness theorem becomes a comparatively easy matter.


We begin by assessing the success of our arithmetization. We have made
claims throughout Section II.5 about what each introduced predicate or function
meant intuitively. We want to make precise and substantiate such claims now,
although we will not exhaustively check every predicate and function that we
have introduced. Just a few spot checks of the most interesting cases will suffice.


First off, we fix our attention on an arbitrary consistent extension, $\Gamma$, of PA, over
some language $L_{N_\Gamma}$, where $\boldsymbol{\Gamma}$ is $\Sigma_1$ over the language of $C$ below.

We also fix an extension by definitions§ of PA, where all the predicates
and functions that we have introduced in our arithmetization, along with their
defining axioms, reside. We will use the symbol $C$ (“C” for Coding) for this
specific extension. The language of $C$ will be denoted by $L_{N_C}$.


As a matter of fact, in the interest of avoiding circumlocutions (in particular
when proving the Hilbert-Bernays DC 3 later on (II.6.34)) we allow all the
primitive recursive function symbols (II.3.6, p. 253) to be in $L_{N_C}$ and,
correspondingly, all the defining axioms of those symbols (i.e., through composition,
primitive recursion, or, explicitly, as definitions of $Z$ and $U_i^n$) to be included
in $C$.


† Cf. II.3.6, p. 253.

‡ The Ableitbarkeitsforderungen that we employ are close, but not identical, to the originals in
Hilbert and Bernays (1968). We have allowed some influence from Löb’s modern derivability
conditions, which we also state and prove, essentially using the old Ableitbarkeitsforderungen
as stepping stones in the proof. Once we have Löb’s conditions, we employ them to prove the
second incompleteness theorem.

§ Recall that extensions by definitions are conservative.
We note that $C$ is a recursive extension of PA in the sense of p. 220, since
$\mathbf{Proof}$ (introduced in the previous section), and $\mathbf{Deriv}$ and $\Theta$ (p. 281 and
p. 280 below) are not introduced as predicates, but rather are introduced as
informal abbreviations.


Note that the languages $L_{N_\Gamma}$ and $L_{N_C}$ may, a priori, extend the language
of PA, $L_N$, in unrelated directions, since the former extension accommodates
whatever new nonlogical axioms are peculiar to $\Gamma$, while the latter simply
accommodates symbols (beyond the basic symbols (NL) of p. 166) that are
introduced by definitions (see, however, II.6.32, p. 303). Thus, $\Gamma$ may fail to be
a conservative extension of PA, unlike $C$.


When we write “$\vdash$” in this section, we mean “$\vdash_C$” (or “$\vdash_C$ plus some
‘temporary’ assumptions”, if a proof by deduction theorem, auxiliary constant,
etc., has been embarked upon). Exceptions will be noted.


The informal symbols “$\mathbf{Proof}$”, “$\mathbf{Deriv}$”, and “$\Theta$” (the latter two to be
introduced shortly) will abbreviate “$\mathbf{Proof}_\Gamma$”, “$\mathbf{Deriv}_\Gamma$”, and “$\Theta_\Gamma$” respectively.


Now (equipped with the tools of Sections I.8–I.9 and II.3.6–II.3.11) it is
easy to verify that all the atomic formulas and terms that we have introduced
in Section II.5 to “recognize specified sets of Gödel numbers” (except for the
formula $\mathbf{Proof}$, because of our assumptions on $\Gamma$) are primitive recursive. For
atomic formulas this means that the corresponding characteristic terms† are.


Pause. Is that so? How about the definition of Sub (p. 268)? It contains un- 
bounded search. 


In particular, the lightface versions of each such formula or term (e.g.,
$\mathit{Var}(x,i)$, $\mathit{Term}(x)$, $\mathit{WFF}(x)$), i.e., the informal versions obtained by
replicating the formal definitions in the metatheory, are in $\mathcal{PR}$ or $\mathcal{PR}_*$. Applying
Theorem II.3.8 and its Corollary II.3.9, we have that the boldface version in
each case both defines semantically (in $L_{N_C}$) and strongly defines (in $C$) the
corresponding lightface version.


Since every Gödel number $t$ is a closed primitive recursive term, II.3.10
implies that for any such $t$ there is a unique $n\in\mathbb{N}$ such that $\vdash t=\tilde{n}$.


We can actually obtain a bit more: If we let, for the balance of this section, $gn(E)$
denote the informal (lightface) Gödel number of a term or formula $E$‡ (that is,
a “real” number in $\mathbb{N}$) while $\ulcorner E\urcorner$ continues to denote the formal version, i.e.,


† Cf. II.1.25 (p. 220) and p. 222.

‡ An arbitrary term is generically denoted by $t$ or $s$ in the metalanguage; a formula, by a
calligraphic uppercase letter such as $\mathcal{A}$. The “$E$” here is a compromise notation that captures (in the
metalanguage) an arbitrary term or formula, that is, an arbitrary well-formed expression (string).

a closed term in $L_{N_C}$, then (by II.3.8–II.3.9)
\[
\vdash\ulcorner E\urcorner=\widetilde{gn(E)} \tag{1}
\]


To see why (1) is true, consider the Gödel numbering that we have developed
starting on p. 168. In the interest of clarity of argument, let us introduce the
functions $\mathbf{f}_i$ ($i=1,\ldots,11$) below, as well as their lightface versions $f_i$,
the latter (not shown) by the same (although lightface) defining equations.
Note the deliberate choice of distinct variables $v_i$ ($i=1,\ldots,11$):
\[
\mathbf{f}_i(v_i)=\langle 0,v_i\rangle\qquad(i=1,\ldots,11)
\]
Then any formal (respectively, informal) Gödel number $\ulcorner E\urcorner$ (respectively,
$gn(E)$) is a closed instance of $\mathbf{h}[v_1,\ldots,v_{11}]$ (respectively, of $h[v_1,\ldots,v_{11}]$)
for some appropriate formal (respectively, informal) primitive recursive $\mathbf{h}$
(respectively, $h$). The closed instance is obtained in a specific way: each $v_i$ is
replaced by $\tilde{\imath}$ (respectively, by $i$).
As we assume that the formal and informal definitions “track each other”,
i.e., have identical primitive recursive derivations except for typeface, then
\[
h=\mathbf{h}^{\mathfrak{N}_C} \tag{1′}
\]
Now (1) translates to
\[
\vdash\mathbf{h}[\tilde{1},\ldots,\widetilde{11}]=(h[1,\ldots,11])^{\sim}
\]
where “$(h[1,\ldots,11])^{\sim}$” denotes that “$\sim$” applies to all of “$h[1,\ldots,11]$”; this
holds by (1′) and II.3.8.


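II.6.1 A Very Long Example. It is trivial that, for any $i\in\mathbb{N}$, $\mathit{Var}(gn(v_i),i)$
is true.† Thus, both
\[
\models_{\mathfrak{N}_C}\mathbf{Var}(\widetilde{gn(v_i)},\tilde{\imath})
\]
and
\[
\vdash\mathbf{Var}(\widetilde{gn(v_i)},\tilde{\imath})
\]
hold, for all $i\in\mathbb{N}$.‡ By (1), p. 274, and Ax4,
\[
\models_{\mathfrak{N}_C}\mathbf{Var}(\ulcorner v_i\urcorner,\tilde{\imath})
\]
and
\[
\vdash\mathbf{Var}(\ulcorner v_i\urcorner,\tilde{\imath})
\]
hold for all $i\in\mathbb{N}$. From the last two and Ax2 follow, for all $i\in\mathbb{N}$,
\[
\models_{\mathfrak{N}_C}\mathbf{var}(\ulcorner v_i\urcorner)
\]
and
\[
\vdash\mathbf{var}(\ulcorner v_i\urcorner)
\]

† Do you believe this? See II.4.14 for a discussion of such outrageous claims.

Conversely, if $\vdash\mathbf{Var}(\tilde{n},\tilde{\imath})$, then $\mathit{Var}(n,i)$ is true by II.3.9, and therefore,
trivially, $gn(v_i)=n$ through examination of the lightface definition (Var)
(identical to the boldface version given on p. 266). Hence
\[
\vdash\widetilde{gn(v_i)}=\tilde{n}
\]
Thus (by (1), p. 274)
\[
\vdash\ulcorner v_i\urcorner=\tilde{n} \tag{2}
\]
In other words, $\vdash\mathbf{Var}(\tilde{n},\tilde{\imath})$ implies that $\tilde{n}$ is provably§ equal to the (formal)
Gödel number of the variable $v_i$.

Similarly, $\vdash\mathbf{var}(\tilde{n})$ implies that $\tilde{n}$ is provably equal to the formal Gödel
number of some variable.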


We next verify that $\mathbf{Term}$ behaves as intended. Indeed, by induction on
terms $t$, we can show in the metatheory that $\mathit{Term}(gn(t))=0$ is true. For
example, if $t=ft_1\ldots t_n$, then the I.H. implies that $\mathit{Term}(gn(t_i))=0$ is true,
for $i=1,\ldots,n$. Given that $gn(t)=\langle gn(f),gn(t_1),\ldots,gn(t_n)\rangle$, and consulting
(Trm),¶ we see that indeed, $\mathit{Term}(gn(t))=0$ is true. The basis cases are
covered as easily. Thus (II.3.8),
\[
\vdash\mathbf{Term}(\widetilde{gn(t)})=0
\]
hence, by (1) and Ax4,
\[
\vdash\mathbf{Term}(\ulcorner t\urcorner)=0
\]

‡ We recall that the metasymbol $\tilde{\imath}$ denotes a (formal) term. On the other hand, $i$ in $v_i$ is part of the
name.

§ In $C$.

¶ This is the formal definition on p. 267. The informal definition is the lightface version of (Trm).


Conversely, $\vdash\mathbf{Term}(\tilde{n})=0$ implies that $\mathit{Term}(n)=0$ is true. Now an informal
(i.e., metamathematical) course-of-values induction on $n$ easily shows that
$n=gn(t)$ for some term $t$, and thus we obtain $\vdash\ulcorner t\urcorner=\tilde{n}$, exactly as we have
obtained (2). For example, say that $\mathit{Term}(n)=0$ because the third disjunct† in
the top case of the definition (Trm) (p. 267) is true, namely,
\[
\mathit{Seq}(n)\wedge(\exists y)_{<n}\bigl(\mathit{lh}(n)=y+1\wedge\mathit{func}((n)_0,y)\wedge(\forall z)_{<y}\mathit{Term}((n)_{z+1})=0\bigr)
\]
Let $y=k<n$ be a value that works above, i.e., such that the following is true:
\[
\mathit{Seq}(n)\wedge\mathit{lh}(n)=k+1\wedge\mathit{func}((n)_0,k)\wedge(\forall z)_{<k}\mathit{Term}((n)_{z+1})=0
\]
By the I.H.‡ there are terms $t_1,\ldots,t_k$ such that $gn(t_z)=(n)_z$ for $z=1,\ldots,k$.
Moreover, the truth of $\mathit{func}((n)_0,k)$ implies, analogously to the case of $\mathit{Var}$,
that there is a function symbol $f$, of arity $k$, such that $gn(f)=(n)_0$. Therefore,
\[
\begin{aligned}
n&=\langle(n)_0,(n)_1,\ldots,(n)_k\rangle\\
&=\langle gn(f),gn(t_1),\ldots,gn(t_k)\rangle\\
&=gn(ft_1\ldots t_k)
\end{aligned}
\]
We have verified that $\vdash\mathbf{Term}(\tilde{n})=0$ implies that $\tilde{n}$ is provably equal to the
(formal) Gödel number of some term $t$.

One similarly verifies that $\mathbf{WFF}$ behaves as intended. We next verify that
$\mathbf{Free}$ and $\mathbf{Sub}$ behave as intended.

Suppose that $v_i$ appears free in $\mathcal{A}$. Referring to the lightface definition of
$\mathit{Free}$ (which is structurally identical to (Fr), p. 267) and using induction on
$\mathcal{A}$ or $t$, as the case may be, it is easy to show that $\mathit{Free}(i,gn(\mathcal{A}))=0$ or
(correspondingly) $\mathit{Free}(i,gn(t))=0$. Thus, one can prove (by (1) and II.3.8–II.3.10)
\[
\vdash\mathbf{Free}(\tilde{\imath},\ulcorner\mathcal{A}\urcorner)=0
\]
or (correspondingly)
\[
\vdash\mathbf{Free}(\tilde{\imath},\ulcorner t\urcorner)=0
\]
Conversely, assume that $\vdash\mathbf{Free}(\tilde{\imath},\tilde{n})=0$ for some $i,n$ in $\mathbb{N}$. Thus, $\mathit{Free}(i,n)=0$
is true. A course-of-values induction on $n$ (over $\mathbb{N}$) shows that $n=gn(E)$
(where $E$ is some term or formula) and $v_i$ is free in $E$.

† A disjunct is a member of a disjunction.

‡ “If $\mathit{Seq}(n)$, then $i<\mathit{lh}(n)\to(n)_i<n$” is crucial here.

For example, if $\mathit{Free}(i,n)=0$ is true because the last disjunct in (Fr) (p. 267) obtains, namely,
\[
\mathit{WFF}(n)=0\wedge(n)_0=gn(\exists)\wedge\neg\mathit{Var}((n)_1,i)\wedge\mathit{Free}(i,(n)_2)=0
\]
then the behaviour of $\mathit{WFF}$ ensures that, for some formula $\mathcal{A}$,
\[
n=gn(\mathcal{A}) \tag{i}
\]
Moreover (by (i) and properties of $\mathit{WFF}$), $n=\langle(n)_0,(n)_1,(n)_2\rangle$; hence for
some $j\in\mathbb{N}$ and formula $\mathcal{B}$,
\[
(n)_0=gn(\exists),\quad(n)_1=gn(v_j)\quad\text{and}\quad\mathcal{A}=((\exists v_j)\mathcal{B})
\]
By $\neg\mathit{Var}((n)_1,i)$, we have $i\ne j$. By I.1.10 and the I.H. (by which the conjunct†
$\mathit{Free}(i,(n)_2)=0$ ensures that $v_i$ is free in $\mathcal{B}$) it follows that $v_i$ is free in $\mathcal{A}$.
Thus, by (1) (p. 274) and (i), $\vdash\ulcorner\mathcal{A}\urcorner=\tilde{n}$ in this case, and hence (by assumption
and Ax4)
\[
\vdash\mathbf{Free}(\tilde{\imath},\ulcorner\mathcal{A}\urcorner)=0
\]
All in all, using (1) for the if part,
\[
\begin{aligned}
v_i\text{ is free in }t&\quad\text{iff}\quad\vdash\mathbf{Free}(\tilde{\imath},\ulcorner t\urcorner)=0\\
v_i\text{ is free in }\mathcal{A}&\quad\text{iff}\quad\vdash\mathbf{Free}(\tilde{\imath},\ulcorner\mathcal{A}\urcorner)=0
\end{aligned}
\]


Turning now to $\mathbf{Sub}$, we validate the claims made on p. 268.
Let at first $t$ and $s$ be terms. It is easy (metamathematical induction on $t$) to
see that
\[
\mathit{Sub}(gn(t),i,gn(s))=gn(t[v_i\leftarrow s]) \tag{3}
\]
is true. Indeed (sampling the proof), let $t=ft_1\ldots t_n$. Referring to the definition
of $\mathit{Sub}$ (mirrored on p. 268 for the boldface case), we obtain
\[
\mathit{Sub}(gn(t),i,gn(s))=\langle gn(f),\mathit{Sub}(gn(t_1),i,gn(s)),\ldots,\mathit{Sub}(gn(t_n),i,gn(s))\rangle
\]
Using the I.H. on $t_i$, the above translates to
\[
\mathit{Sub}(gn(t),i,gn(s))=\langle gn(f),gn(t_1[v_i\leftarrow s]),\ldots,gn(t_n[v_i\leftarrow s])\rangle
\]
Since $gn(t[v_i\leftarrow s])=\langle gn(f),gn(t_1[v_i\leftarrow s]),\ldots,gn(t_n[v_i\leftarrow s])\rangle$, (3)
follows.


† A conjunct is a member of a conjunction.
Let next $\mathcal{A}$ be a formula, and $s$ be a term substitutable for $v_i$ in $\mathcal{A}$. By
induction on $\mathcal{A}$ we can show in the metatheory that
\[
\mathit{Sub}(gn(\mathcal{A}),i,gn(s))=gn(\mathcal{A}[v_i\leftarrow s]) \tag{4}
\]
is true. Sampling the proof of (4), let us consider an interesting case, namely,
$\mathcal{A}=(\exists v_j)\mathcal{B}$, where $i\ne j$. By the I.H. on formulas,
\[
\mathit{Sub}(gn(\mathcal{B}),i,gn(s))=gn(\mathcal{B}[v_i\leftarrow s]) \tag{5}
\]
By I.3.10–I.3.11, $\mathcal{A}[v_i\leftarrow s]=((\exists v_j)\mathcal{B}[v_i\leftarrow s])$. Thus,
\[
gn(\mathcal{A}[v_i\leftarrow s])=\langle gn(\exists),gn(v_j),gn(\mathcal{B}[v_i\leftarrow s])\rangle \tag{6}
\]
By the definition of $\mathit{Sub}$,
\[
\mathit{Sub}(gn(\mathcal{A}),i,gn(s))=\langle gn(\exists),gn(v_j),\mathit{Sub}(gn(\mathcal{B}),i,gn(s))\rangle
\]
Thus, (5) and (6) yield (4).
Here is the other interesting case, where $\mathcal{A}$ is a formula, and $s$ is a term that
is not substitutable for $v_i$ in $\mathcal{A}$. By induction on $\mathcal{A}$ we can show this time that
\[
\mathit{Sub}(gn(\mathcal{A}),i,gn(s))=0 \tag{7}
\]
We just sample the same case as above. First off, for any term or formula $E$,
$gn(E)>0$.


Pause. Do you believe this? 
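One informal way to build intuition for the claim is to experiment with a concrete sequence coding. The sketch below uses the familiar prime-power coding $\langle a_0,\ldots,a_{n-1}\rangle=\prod_k p_k^{a_k+1}$, which is an assumption made purely for illustration and need not be the coding adopted in this chapter; the function names are hypothetical. Under it every code is positive, and every component of a code is strictly smaller than the code itself.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for short sequences)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

def code(seq):
    """Prime-power code of a finite sequence of natural numbers."""
    result = 1
    for p, a in zip(primes(), seq):
        result *= p ** (a + 1)
    return result

def component(x, k):
    """(x)_k: the exponent of the k-th prime in x, minus one."""
    gen = primes()
    for _ in range(k + 1):      # advance to the k-th prime (0-indexed)
        p = next(gen)
    exponent = 0
    while x % p == 0:
        x //= p
        exponent += 1
    return exponent - 1
```

For instance, `code([0, 1])` is $2^1\cdot 3^2=18$, with `component(18, 0) == 0` and `component(18, 1) == 1`. Since a Gödel number of an expression is always the code of some sequence, positivity follows at once under this (assumed) coding.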


Consider first the subcase where $\mathcal{A}=(\exists v_j)\mathcal{B}$, and $s$ is substitutable for $v_i$ in
$\mathcal{B}$. By (4), $\mathit{Sub}(gn(\mathcal{B}),i,gn(s))=gn(\mathcal{B}[v_i\leftarrow s])>0$.

Now, by assumption of non-substitutability for $v_i$ in $\mathcal{A}$, it must be that $v_j$
is free in $s$. Thus, the relevant condition for the definition of $\mathit{Sub}$ (this is the
second from last condition, p. 268) fails, since $\mathit{Free}(j,gn(s))=0$. Therefore
the definition (of $\mathit{Sub}$) returns 0 (the “otherwise”), and (7) is correct in this
subcase.

The remaining subcase is that substitutability of $s$ failed earlier, that is
(I.H.), $\mathit{Sub}(gn(\mathcal{B}),i,gn(s))=0$. Still, the relevant condition in the definition
of $\mathit{Sub}$ is the second from last. It fails once more, since the conjunct
“$\mathit{Sub}(gn(\mathcal{B}),i,gn(s))>0$” is false. Therefore,
\[
\mathit{Sub}(gn(\mathcal{A}),i,gn(s))>0\quad\text{iff}\quad\text{the term }s\text{ is substitutable for }v_i\text{ in }\mathcal{A}
\]
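We translate the above, as well as (3) and (4), to the formal domain using
(1) (p. 274) and II.3.8–II.3.9:
\[
\begin{aligned}
&\vdash\mathbf{Sub}(\ulcorner\mathcal{A}\urcorner,\tilde{\imath},\ulcorner s\urcorner)>0\quad\text{iff}\quad\text{the term }s\text{ is substitutable for }v_i\text{ in }\mathcal{A}\\
&\vdash\mathbf{Sub}(\ulcorner t[v_i]\urcorner,\tilde{\imath},\ulcorner s\urcorner)=\ulcorner t[s]\urcorner\\
&\vdash\mathbf{Sub}(\ulcorner\mathcal{A}[v_i]\urcorner,\tilde{\imath},\ulcorner s\urcorner)=\ulcorner\mathcal{A}[s]\urcorner\quad\text{if the term }s\text{ is substitutable for }v_i\text{ in }\mathcal{A}
\end{aligned}
\]
The claims regarding $\mathbf{sub}_i$ (p. 268) are trivial consequences of the above.
We finally check $\mathbf{Proof}$.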


Turning to the lightface version (see p. 271 for the boldface version), let
$n=\langle gn(\mathcal{A}_0),\ldots,gn(\mathcal{A}_{k-1})\rangle$, where $k>0$ and $\mathcal{A}_0,\ldots,\mathcal{A}_{k-1}$ is a $\Gamma$-proof.
Assume that $\boldsymbol{\Gamma}(x)$ (and hence $\Gamma(x)$) is primitive recursive,† an assumption that
is correct in the case of PA (Exercise II.15).

That $\mathit{Seq}(n)$ and $\mathit{lh}(n)=k>0$ is true outright. To see that $\mathit{Proof}(n)$ is
true we now only need to investigate $(n)_i$ for $i<k$ and verify that it satisfies
\[
\mathit{LA}((n)_i)\vee\Gamma((n)_i)\vee(\exists j)_{<i}(\exists m)_{<i}\mathit{MP}((n)_j,(n)_m,(n)_i)\vee(\exists j)_{<i}\mathit{EI}((n)_j,(n)_i)\vee(\exists j)_{<i}\mathit{SR}((n)_j,(n)_i) \tag{8}
\]
Now, $(n)_i=gn(\mathcal{A}_i)$, for $i<k$. The following exhaust all cases:‡

Case 1. $\mathcal{A}_i\in\Lambda\cup\Gamma$. Then the first or second disjunct of (8) is true.

Case 2. For some $j<i$ and $m<i$, one has $\mathcal{A}_m=\mathcal{A}_j\to\mathcal{A}_i$. Then
$\mathit{MP}((n)_j,(n)_m,(n)_i)$ is true.

Case 3. For some $j<i$, one has $\mathcal{A}_j=\mathcal{B}\to\mathcal{C}$, $x$ is not free in $\mathcal{C}$, and
$\mathcal{A}_i=(\exists x)\mathcal{B}\to\mathcal{C}$. Then $\mathit{EI}((n)_j,(n)_i)$ is true.

Case 4. For some $j<i$, and some variable $x$ and term $t$ substitutable for $x$ in
$\mathcal{A}_j$, one has $\mathcal{A}_i=\mathcal{A}_j[x\leftarrow t]$. Then $\mathit{SR}((n)_j,(n)_i)$ is true.


Thus, (8) is true under all possible cases.


Conversely, one can show by (metamathematical) course-of-values induction
on $\mathit{lh}(n)$ that if $\mathit{Proof}(n)$ is true, then there is a $\Gamma$-proof $\mathcal{A}_0,\ldots,\mathcal{A}_{k-1}$,
where $k=\mathit{lh}(n)>0$, such that $n=\langle gn(\mathcal{A}_0),\ldots,gn(\mathcal{A}_{k-1})\rangle$. Note that truth of
$\mathit{Proof}(n)$ guarantees the truth of $\mathit{Seq}(n)$ and $\mathit{lh}(n)>0$ (definition of $\mathit{Proof}$).
Suppose that $k=1$ (basis). Then the only disjuncts that can be true in (8) are
the first two (why?). If the first one holds, then this implies that $(n)_0=gn(\mathcal{B})$
for some formula $\mathcal{B}\in\Lambda$ (see (LA), p. 270); if the second one holds, then
$(n)_0=gn(\mathcal{B})$ for some formula $\mathcal{B}\in\Gamma$ (hence, in either case, the one-member
sequence “$\mathcal{B}$”, which we may want to rename “$\mathcal{A}_0$”, is a $\Gamma$-proof). The reader
will have no difficulty completing the induction. By the primitive recursiveness
of $\mathit{Proof}$ and II.3.9,
\[
\mathit{Proof}(n)\text{ is true}\quad\text{iff}\quad\vdash\mathbf{Proof}(\tilde{n}) \tag{9}
\]

‡ Recall that we have agreed to allow the redundant substitution rule (p. 270).

In particular, if $\mathcal{A}_0,\ldots,\mathcal{A}_{k-1}$ is some $\Gamma$-proof and we set
\[
n=\langle gn(\mathcal{A}_0),\ldots,gn(\mathcal{A}_{k-1})\rangle
\]
and also set
\[
t=\langle\ulcorner\mathcal{A}_0\urcorner,\ldots,\ulcorner\mathcal{A}_{k-1}\urcorner\rangle
\]
(the formal Gödel number of the proof), then $\vdash t=\tilde{n}$. Since $\mathit{Proof}(n)$ is
true, (9) yields $\vdash\mathbf{Proof}(\tilde{n})$; hence
\[
\vdash\mathbf{Proof}(t)
\]
Now, noting that $\mathbf{Proof}^{\mathfrak{N}_C}=\mathit{Proof}$, the only-if direction of (9) also holds
for our $\Sigma_1$ $\Gamma$ (by II.4.12). Thus, the claim we made above, “In particular, if
$\mathcal{A}_0,\ldots,\mathcal{A}_{k-1}$ is, etc.”, still holds without the restriction to a primitive recursive
$\Gamma$. Is the if direction still true?


II.6.2 Definition (The Provability Predicate). The informal abbreviation $\Theta$
introduced below is called the provability predicate:
\[
\Theta(x)\text{ stands for }(\exists y)\bigl(\mathbf{Proof}(y)\wedge(\exists i)_{<\mathbf{lh}(y)}\,x=(y)_i\bigr)
\]

II.6.3 Remark. (1) The definition of $\Theta$ and the use of the term “provability
formula” is modulo a fixed set of nonlogical axioms $\Gamma$. We have already fixed
attention on such a set, but our attention may shift to other sets, $\Gamma'$, $\Gamma''$, etc., on
occasion. All that is required is to observe that:

• Any such primed $\Gamma$ is a consistent extension of PA.

• The corresponding predicate, $\boldsymbol{\Gamma}'$, is $\Sigma_1$ over the language of $C$, $L_{N_C}$,
where all the symbols (and more) that we use for Gödel coding belong.

(2) Note that we did not require $x$ to be the last formula in the “proof” $y$.


† We have cheated here and allowed, for the purpose of this informal verification, the “local”
assumption that $\Gamma$ is primitive recursive. Recall that our “global” assumption on $\Gamma$ (the one
operative throughout this section, outside this example) is that it is a $\Sigma_1$ formula. In this
connection the reader should note the concluding sentence of this example.
(3) $\Theta(x)$ is in $\Sigma_1(L_{N_C})$.


(4) From the work in the previous example we see that $\Theta(x)$ intuitively says
that $x$ is the Gödel number of some theorem. More precisely, if $\Gamma\vdash\mathcal{A}$ ($\mathcal{A}$ is
over $\Gamma$’s language) and we set $t=\ulcorner\mathcal{A}\urcorner$ (and also $n=gn(\mathcal{A})$), then $\Theta(\tilde{n})$ is
a true $\Sigma_1(L_{N_C})$ sentence. By II.4.12,
\[
\vdash\Theta(\tilde{n})
\]
Hence (by $\vdash t=\tilde{n}$)
\[
\vdash\Theta(t)
\]
The reader is reminded that “$\vdash$” means “$\vdash_C$” unless noted otherwise.


(5) The preceding discussion completes the (sketch of) proof of I.9.33 given
on p. 177. That is, that the lightface version of $\Theta(x)$ is semi-recursive.
A number of issues that were then left open (e.g., the recursiveness of $\Lambda$ and
of the rules $I_1$ and $I_2$) have now been settled.


(6) It is useful to introduce an informal abbreviation for
\[
\mathbf{Proof}(y)\wedge(\exists i)_{<\mathbf{lh}(y)}\,x=(y)_i
\]
We will use the name $\mathbf{Deriv}$, introduced by
\[
\mathbf{Deriv}(y,x)\text{ stands for }\mathbf{Proof}(y)\wedge(\exists i)_{<\mathbf{lh}(y)}\,x=(y)_i
\]
That is, $\mathbf{Deriv}(y,x)$ intuitively says that the proof (coded by) $y$ derives the
theorem (coded by) $x$.
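The relationship between Deriv and the provability predicate (an unbounded search for a proof in which $x$ occurs somewhere, not necessarily last) can be illustrated on a toy system. Everything here is an assumption for illustration: the axiom set `AXIOMS`, the tuple form `("->", a, b)` for implications, the restriction to modus ponens, and the artificial length bound that keeps the search finite.

```python
from itertools import product

AXIOMS = {"p", ("->", "p", "q")}   # hypothetical toy axioms

def is_proof(y):
    """Toy checker: every entry is an axiom or follows by modus ponens."""
    if len(y) == 0:
        return False
    for i, f in enumerate(y):
        earlier = y[:i]
        by_mp = any(b == ("->", a, f) for a in earlier for b in earlier)
        if f not in AXIOMS and not by_mp:
            return False
    return True

def deriv(y, x):
    """Mirror of Deriv(y, x): y is a proof and x occurs somewhere in it."""
    return is_proof(y) and x in y

def theta_search(x, candidates, max_len):
    """Bounded stand-in for the existential quantifier in the provability predicate.

    Tries every sequence over `candidates` of length at most `max_len`
    and returns the first one that derives x, or None if none is found.
    """
    for n in range(1, max_len + 1):
        for y in product(candidates, repeat=n):
            if deriv(y, x):
                return y
    return None
```

A positive answer is conclusive, whereas `None` only reports that no proof was found within the bound; removing the bound turns the search into a semi-decision procedure, in line with the semi-recursiveness noted in item (5).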


In the following lemmata we “do” some of the metatheory of arithmetic
within formal arithmetic, having arithmetized the language. For example, the
first lemma says that if we substitute a term into some variable of another term,
then we obtain a term. The chapter concludes with a proof (second incompleteness
theorem) that we cannot “do” all the metatheory within the theory.


II.6.4 Lemma.
\[
\vdash\mathbf{Term}(x)=0\wedge\mathbf{Term}(z)=0\to\mathbf{Term}(\mathbf{Sub}(x,i,z))=0
\]

Proof. We do this proof in some detail, since in most others in the following
sequence of lemmata, we delegate much of the burden of proof to the reader.
(Warning: This proof will be rather pedantic.)
We do (formal) course-of-values induction (II.1.23, p. 214) on $x$, proceeding
according to the definition of $\mathbf{Term}$ (p. 267). The latter yields (cf. (9), p. 221)
\[
\vdash\mathbf{Term}(x)=0\leftrightarrow\mathbf{var}(x)\vee x=\langle 0,S0\rangle\vee\mathbf{Seq}(x)\wedge(\exists y)_{<x}\bigl(\mathbf{lh}(x)=Sy\wedge\mathbf{func}((x)_0,y)\wedge(\forall z)_{<y}\mathbf{Term}((x)_{Sz})=0\bigr) \tag{1}
\]
Now assume
\[
\mathbf{Term}(x)=0\quad\text{and}\quad\mathbf{Term}(z)=0 \tag{2}
\]
and prove
\[
\mathbf{Term}(\mathbf{Sub}(x,i,z))=0 \tag{3}
\]
We have cases according to (1):

Case of $\mathbf{var}(x)$: Thus (p. 266) $\vdash(\exists j)_{<x}\mathbf{Var}(x,j)$. We may now add a
new constant $a$ and the assumption
\[
a<x\wedge\mathbf{Var}(x,a)
\]
The subcase $a=i$ and the definition of $\mathbf{Sub}$ (first case; cf. also the definition
by cases in (15), p. 221) yield
\[
\vdash\mathbf{Sub}(x,i,z)=z \tag{4}
\]
Similarly, the (only other) subcase, $\neg a=i$, and the definition of $\mathbf{Sub}$ (third
case) yield
\[
\vdash\mathbf{Sub}(x,i,z)=x \tag{5}
\]
Either of (4) or (5) yields (3) because of (2), and we are done in this case.

Pause. In the last subcase I used (tacitly) the “fact” that
\[
\vdash\mathbf{Var}(x,a)\to\neg a=i\to\neg\mathbf{Var}(x,i)
\]
Is this indeed a fact? (Hint. $\mathbf{lh}$.)

Case of $x=\langle 0,S0\rangle$: The definition of $\mathbf{Sub}$ (second case) yields (5), and
once again we have derived (3) from (2).

Finally the hard case, that is, that of the third disjunct in (1) above:
\[
\mathbf{Seq}(x)\wedge(\exists y)_{<x}\bigl(\mathbf{lh}(x)=Sy\wedge\mathbf{func}((x)_0,y)\wedge(\forall z)_{<y}\mathbf{Term}((x)_{Sz})=0\bigr) \tag{hc}
\]
By (hc) we have
\[
\vdash\mathbf{Seq}(x) \tag{6}
\]
and we may also introduce a new constant $b$ and the assumption
\[
b<x\wedge\mathbf{lh}(x)=Sb\wedge\mathbf{func}((x)_0,b)\wedge(\forall z)_{<b}\mathbf{Term}((x)_{Sz})=0 \tag{7}
\]
On the other hand, the definition of $\mathbf{Sub}$ (fourth case) and (2) (cf. (15), p. 221)
yield
\[
\vdash\mathbf{Sub}(x,i,z)=(\mu w)\bigl(\mathbf{Seq}(w)\wedge\mathbf{lh}(w)=\mathbf{lh}(x)\wedge(w)_0=(x)_0\wedge(\forall y)_{<\mathbf{lh}(x)}(y>0\to(w)_y=\mathbf{Sub}((x)_y,i,z))\bigr) \tag{8}
\]
Using the abbreviation $t=\mathbf{Sub}(x,i,z)$ for convenience, we get from (8) via
(f) (p. 218)
\[
\vdash\mathbf{Seq}(t) \tag{9}
\]
\[
\vdash\mathbf{lh}(t)=\mathbf{lh}(x) \tag{10}
\]
\[
\vdash(t)_0=(x)_0 \tag{11}
\]
and
\[
\vdash(\forall y)_{<\mathbf{lh}(x)}(y>0\to(t)_y=\mathbf{Sub}((x)_y,i,z)) \tag{12}
\]
By (6) and II.2.13,
\[
\vdash y<\mathbf{lh}(x)\to(x)_y<x \tag{13}
\]
Now (7) yields
\[
\vdash(\forall z)_{<\mathbf{lh}(x)}(z>0\to\mathbf{Term}((x)_z)=0)
\]
Thus (12) and (13), via the I.H., yield
\[
\vdash(\forall y)_{<\mathbf{lh}(x)}(y>0\to\mathbf{Term}((t)_y)=0) \tag{14}
\]
that is, reintroducing $b$ (see (7)),
\[
\vdash(\forall y)_{<b}\mathbf{Term}((t)_{Sy})=0 \tag{14′}
\]
By (7), (14′), (10), and (11) we now have
\[
\vdash\mathbf{lh}(t)=Sb\wedge\mathbf{func}((t)_0,b)\wedge(\forall y)_{<b}\mathbf{Term}((t)_{Sy})=0
\]
Since $\vdash\mathbf{lh}(t)<t$ by II.2.4(i), and therefore $\vdash b<t$ by the first conjunct of
the above, we obtain
\[
\vdash b<t\wedge\mathbf{lh}(t)=Sb\wedge\mathbf{func}((t)_0,b)\wedge(\forall y)_{<b}\mathbf{Term}((t)_{Sy})=0
\]
Hence, using a new variable $z$, Ax2, and (9),
\[
\vdash\mathbf{Seq}(t)\wedge(\exists z)_{<t}\bigl(\mathbf{lh}(t)=Sz\wedge\mathbf{func}((t)_0,z)\wedge(\forall y)_{<z}\mathbf{Term}((t)_{Sy})=0\bigr) \tag{15}
\]
By the definition of $\mathbf{Term}$, $\vdash\mathbf{Term}(t)=0$ follows immediately from (15).


The above result (and the following suite of similar results) is much easier to
prove if instead of free variables we have numerals. We just invoke II.3.8–II.3.9.
However, we do need the versions with free variables.


The following says that a term is substitutable into any variable of another
term.†


II.6.5 Corollary.
\[
\vdash\mathbf{Term}(x)=0\wedge\mathbf{Term}(z)=0\to\mathbf{Sub}(x,i,z)>0
\]

Proof. Assume the hypothesis (to the left of “$\to$”). Then (lemma above)
\[
\vdash\mathbf{Term}(\mathbf{Sub}(x,i,z))=0 \tag{1}
\]
Now
\[
\vdash\mathbf{Term}(x)=0\to\mathbf{Seq}(x)\wedge\mathbf{lh}(x)>0
\]
by inspection of the cases involved in the definition of $\mathbf{Term}$ (see (1) in the
previous proof). By II.2.13,
\[
\vdash\mathbf{Seq}(x)\wedge\mathbf{lh}(x)>0\to x>(x)_0
\]
Thus, by II.1.5 and II.1.6,
\[
\vdash\mathbf{Term}(x)=0\to x>0 \tag{2}
\]
By substitution from (1), and (2), we have $\vdash\mathbf{Sub}(x,i,z)>0$.


† By a variable of a term we understand any variable (hence the free $i$ in the corollary statement),
not only one that actually occurs in the term. That is, we mean a variable $x$ of $t$ in the sense $t[x]$.
II.6.6 Lemma. $\vdash\mathbf{Term}(x)=0\wedge\mathbf{Free}(i,x)=0\to\mathbf{Sub}(x,i,z)\ge z$.


Proof. Assume the hypothesis, i.e.,
\[
\mathbf{Term}(x)=0
\]
and
\[
\mathbf{Free}(i,x)=0
\]
We do (formal) course-of-values induction on $x$. By the definition of $\mathbf{Free}$
(p. 267), we have just two cases to consider. Here is why, and what:

In principle, we have the three cases as in the proof of II.6.4. However, the
first case considered in II.6.4 has here only its first subcase tenable, namely,
where $a=i$. This is because the other subcase, $\neg a=i$, implies as before
$\vdash\mathbf{var}(x)\wedge\neg\mathbf{Var}(x,i)$. Now this forces $\vdash\mathbf{Free}(i,x)=S0$, for if we have
$\vdash\mathbf{var}(x)$, then $(x)_0=1$; hence we cannot have any of $\vdash(\exists y)_{<x}\mathbf{func}((x)_0,y)$,†
$\mathbf{AF}(x)$, or $\mathbf{WFF}(x)=0$.

Thus, (4) of the proof of II.6.4 holds.


The second case in the proof of II.6.4, namely, $x=\langle 0,S0\rangle$, is also untenable
as before (see (Fr), p. 267), since here $\vdash(x)_0=0$.


This leaves the hard case (hc), which yields (8) of the proof of II.6.4, namely,
\[
\vdash\mathbf{Sub}(x,i,z)=(\mu w)\bigl(\mathbf{Seq}(w)\wedge\mathbf{lh}(w)=\mathbf{lh}(x)\wedge(w)_0=(x)_0\wedge(\forall y)_{<\mathbf{lh}(x)}(y>0\to(w)_y=\mathbf{Sub}((x)_y,i,z))\bigr)
\]
Thus, by (f) (p. 218), tautological implication, and eliminating $\forall$,
\[
\vdash 0<y\wedge y<\mathbf{lh}(x)\to(\mathbf{Sub}(x,i,z))_y=\mathbf{Sub}((x)_y,i,z) \tag{1}
\]
The definition of $\mathbf{Free}$ (p. 267) yields
\[
\vdash(\exists y)_{<\mathbf{lh}(x)}(0<y\wedge\mathbf{Free}(i,(x)_y)=0)
\]
Invoking proof by auxiliary constant, we now add
\[
0<a\wedge a<\mathbf{lh}(x)\wedge\mathbf{Free}(i,(x)_a)=0 \tag{2}
\]
where $a$ is a new constant. By (2) (first two conjuncts) and (1),
\[
\vdash(\mathbf{Sub}(x,i,z))_a=\mathbf{Sub}((x)_a,i,z) \tag{3}
\]
Now, since (cf. (Trm), p. 267)
\[
\vdash\mathbf{Term}(x)=0\to(\forall z)_{<\mathbf{lh}(x)}(0<z\to\mathbf{Term}((x)_z)=0)
\]

† Since $\vdash\mathbf{lh}((x)_0)<(x)_0$ and (see (Func), p. 266) $\vdash\mathbf{lh}((x)_0)>0$.
we obtain by hypothesis and specialization
\[
\vdash 0<a\wedge a<\mathbf{lh}(x)\to\mathbf{Term}((x)_a)=0
\]
Thus
\[
\vdash\mathbf{Term}((x)_a)=0 \tag{4}
\]
via the first two conjuncts of (2). Using now (2) (last conjunct), (4), and the
I.H. (this uses II.2.13) we get
\[
\vdash\mathbf{Sub}((x)_a,i,z)\ge z \tag{5}
\]
Since $\vdash\mathbf{Seq}(\mathbf{Sub}(x,i,z))$, II.2.13 yields
\[
\vdash\mathbf{Sub}(x,i,z)>(\mathbf{Sub}(x,i,z))_a
\]
By (3) and (5) we now have $\vdash\mathbf{Sub}(x,i,z)\ge z$.


The next three claims (stated without proof, since their proofs are trivial
variations of the preceding three) take care of atomic formulas.


II.6.7 Lemma.
\[
\vdash\mathbf{AF}(x)\wedge\mathbf{Term}(z)=0\to\mathbf{AF}(\mathbf{Sub}(x,i,z))
\]

II.6.8 Corollary.
\[
\vdash\mathbf{AF}(x)\wedge\mathbf{Term}(z)=0\to\mathbf{Sub}(x,i,z)>0
\]

II.6.9 Lemma. $\vdash\mathbf{AF}(x)\wedge\mathbf{Free}(i,x)=0\to\mathbf{Sub}(x,i,z)\ge z$.


For the next result we want to ensure that a substitution actually took place
(hence the condition on substitutability, “$\mathbf{Sub}(x,i,z)>0$”, which was
unnecessary in the cases of term or atomic formula targets).


II.6.10 Proposition.
\[
\vdash\mathbf{Term}(x)=0\vee\mathbf{WFF}(x)=0\to\mathbf{Sub}(x,i,z)>0\wedge\mathbf{Free}(i,x)=0\to\mathbf{Sub}(x,i,z)\ge z
\]

Proof. The case for $\mathbf{Term}(x)=0$ is II.6.6 above. The subcase $\mathbf{AF}(x)$ for
$\mathbf{WFF}(x)=0$ is II.6.9 (for either of these the assumption $\mathbf{Sub}(x,i,z)>0$
is redundant).
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We consider here one among the other subcases of WF F(x) = 0. Once 
again, we employ course-of-values induction on x, legitimized by II.2.13.i Add 
then the assumptions WF F(x) = 0, Sub(x,i,z) > 0 and Free(i,x) = 0. 


Subcase. x = ("V',(x)s0, X)sso). Thus (definition of WF F, p. 267) 
+ WFF((x)so) = 0 
and 
+ WFF((x)sso) = 0 
By the definition of Free (p. 267) we now have 
+ Free(i, (x)so) = 0 V Free@, (x)sso) = 0 (1) 
By definition of Sub we obtain 


+ Sub(x,i,z) = (" V |, Sub((x)s0,i,z), Sub((x)ss05 45 2)) (2) 


Pause. Wait a minute! The above is so provided that 
F Sub((x)s0,4,2) > 0 A Sub((x)ss0,1,z) > 0 
Is this satisfied? Why? 


By (1), we consider cases. So add Free(i,(x)so) = 0. The LH. and 
+ Sub((x)so,i,z) > 0 yield 


- Sub((x)s0, i, Z) > z 
and hence 
-F Sub(x,i,z) > z 


exactly as in II.6.6, invoking + (...,u,...) > u. The other case works out 
as well. 


II.6.10 justifies the bound of the existential quantifiers in the definition of SR 


(p. 270). & 


II.6.11 Lemma.
$$\vdash \mathit{WFF}(x) = 0 \land \mathit{Term}(z) = 0 \land \mathit{Sub}(x,i,z) > 0 \to \mathit{WFF}(\mathit{Sub}(x,i,z)) = 0$$

† This is the last time we are reminded of the role of II.2.13 in our inductions on Gödel numbers.

Proof. We add the assumptions
$$\mathit{WFF}(x) = 0$$
$$\mathit{Term}(z) = 0$$
and
$$\mathit{Sub}(x,i,z) > 0 \qquad (1)$$
and do (formal) course-of-values induction on $x$, proceeding according to the definition of $\mathit{WFF}$ (p. 267), towards proving
$$\vdash \mathit{WFF}(\mathit{Sub}(x,i,z)) = 0$$

We illustrate what is involved by considering one case, leaving the rest to the reader.

Case "$\lnot$". $x = \langle \ulcorner\lnot\urcorner, y\rangle$ and $\vdash \mathit{WFF}(y) = 0$, where we have used the abbreviation $y = (x)_{S0}$ for convenience.
We also add the assumption (which we hope to contradict)
$$\mathit{Sub}(y,i,z) = 0$$
Thus,
$$\vdash \lnot\, \mathit{Sub}(y,i,z) > 0$$
and therefore, via tautological implication and Ax4 (since $\vdash (x)_{S0} = y$),
$$\vdash \lnot\bigl(\mathit{WFF}(x) = 0 \land (x)_0 = \ulcorner\lnot\urcorner \land \mathit{Sub}((x)_{S0},i,z) > 0\bigr)$$

Thus, the "otherwise"‡ in the definition of $\mathit{Sub}$ (p. 268) is provable, since $(x)_0 = \ulcorner\lnot\urcorner$ is refutable in all the other cases. Therefore, $\vdash \mathit{Sub}(x,i,z) = 0$, contradicting the assumption (1).

We have just established
$$\vdash \mathit{Sub}(y,i,z) > 0$$
Hence also (definition of $\mathit{Sub}$)
$$\vdash \mathit{Sub}(x,i,z) = \langle \ulcorner\lnot\urcorner, \mathit{Sub}(y,i,z)\rangle \qquad (2)$$
By I.H.,
$$\vdash \mathit{WFF}(\mathit{Sub}(y,i,z)) = 0$$
Hence, by the definition of $\mathit{WFF}$,
$$\vdash \mathit{WFF}(\langle \ulcorner\lnot\urcorner, \mathit{Sub}(y,i,z)\rangle) = 0$$
Thus
$$\vdash \mathit{WFF}(\mathit{Sub}(x,i,z)) = 0$$
by (2) and Ax4.

‡ That is, the conjunction of the negations of all the explicit cases.


Pause. Where was the assumption Term(z) = 0 used? 


II.6.12 Lemma.
$$\vdash (\forall j)\mathit{Free}(j,z) > 0 \land \mathit{Term}(z) = 0 \land \mathit{WFF}(x) = 0 \to \mathit{Sub}(x,i,z) > 0$$

This says that closed terms are always substitutable.

Proof. We assume
$$(\forall j)\mathit{Free}(j,z) > 0$$
$$\mathit{Term}(z) = 0$$
and
$$\mathit{WFF}(x) = 0$$
and do (formal) course-of-values induction on $x$, proceeding according to the definition of $\mathit{WFF}$, to show
$$\vdash \mathit{Sub}(x,i,z) > 0$$
We illustrate what is involved by considering one case, leaving the remaining cases to the reader.

Case "$\exists$". Add $x = \langle \ulcorner\exists\urcorner, y, w\rangle$, $\mathit{WFF}(w) = 0$, and $\mathit{var}(y)$, where we have used the abbreviations $y = (x)_{S0}$ and $w = (x)_{SS0}$ for convenience. The latter expands to (see (var), p. 266)
$$\vdash (\exists j)_{<y}\mathit{Var}(y,j) \qquad (3)$$

Assume first the interesting subcase.

Subcase. Add $\lnot\mathit{Var}(y,i)$. By (3) we may add a new constant $a$ and the assumption
$$a < y \land \mathit{Var}(y,a) \qquad (4)$$
By $(\forall j)\mathit{Free}(j,z) > 0$,
$$\vdash \mathit{Free}(a,z) > 0$$
Hence by (4) and Ax2
$$\vdash (\exists j)_{<y}(\mathit{Var}(y,j) \land \mathit{Free}(j,z) > 0)$$
By the I.H., $\vdash \mathit{Sub}(w,i,z) > 0$; hence (definition of $\mathit{Sub}$)
$$\vdash \mathit{Sub}(x,i,z) = \langle \ulcorner\exists\urcorner, y, \mathit{Sub}(w,i,z)\rangle$$
Thus, $\vdash \mathit{Sub}(x,i,z) > 0$ (e.g., $\mathit{Sub}(x,i,z) > \ulcorner\exists\urcorner$).

Subcase. Add $\mathit{Var}(y,i)$. Now
$$\vdash \mathit{Sub}(x,i,z) = x$$
Therefore, once more, $\vdash \mathit{Sub}(x,i,z) > 0$. (Why is $\vdash x > 0$?)
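II.6.13 Lemma. $\vdash \mathit{Proof}(x) \land \mathit{Proof}(y) \to \mathit{Proof}(x * y)$.

Proof. Assume $\mathit{Proof}(x)$ and $\mathit{Proof}(y)$. Thus,
$$\begin{aligned}
\vdash{} & \mathit{Seq}(x) \land \mathit{lh}(x) > 0 \land {} \\
& (\forall i)_{<\mathit{lh}(x)}\bigl(\mathit{LA}((x)_i) \lor \Gamma((x)_i) \lor {} \\
& \quad (\exists j)_{<i}(\exists k)_{<i}\mathit{MP}((x)_j,(x)_k,(x)_i) \lor {} \\
& \quad (\exists j)_{<i}\mathit{EI}((x)_j,(x)_i) \lor {} \\
& \quad (\exists j)_{<i}\mathit{SR}((x)_j,(x)_i)\bigr)
\end{aligned} \qquad (1)$$
and
$$\begin{aligned}
\vdash{} & \mathit{Seq}(y) \land \mathit{lh}(y) > 0 \land {} \\
& (\forall i)_{<\mathit{lh}(y)}\bigl(\mathit{LA}((y)_i) \lor \Gamma((y)_i) \lor {} \\
& \quad (\exists j)_{<i}(\exists k)_{<i}\mathit{MP}((y)_j,(y)_k,(y)_i) \lor {} \\
& \quad (\exists j)_{<i}\mathit{EI}((y)_j,(y)_i) \lor {} \\
& \quad (\exists j)_{<i}\mathit{SR}((y)_j,(y)_i)\bigr)
\end{aligned} \qquad (2)$$
By II.2.21 (p. 247) we have
$$\vdash \mathit{Seq}(x * y)$$
$$\vdash \mathit{lh}(x * y) = \mathit{lh}(x) + \mathit{lh}(y) \qquad (3)$$
Hence
$$\vdash \mathit{Seq}(x * y) \land \mathit{lh}(x * y) > 0 \qquad (4)$$

(4) settles the first two conjuncts of $\mathit{Proof}(x * y)$. We now address the last conjunct, $(\forall i)_{<\mathit{lh}(x*y)}(\mathit{LA}((x * y)_i) \lor \ldots)$. It suffices to drop the quantifier and show
$$\begin{aligned}
\vdash{} & i < \mathit{lh}(x * y) \to {} \\
& \mathit{LA}((x*y)_i) \lor \Gamma((x*y)_i) \lor {} \\
& (\exists j)_{<i}(\exists k)_{<i}\mathit{MP}((x*y)_j,(x*y)_k,(x*y)_i) \lor {} \\
& (\exists j)_{<i}\mathit{EI}((x*y)_j,(x*y)_i) \lor {} \\
& (\exists j)_{<i}\mathit{SR}((x*y)_j,(x*y)_i)
\end{aligned} \qquad (5)$$

We note, again by II.2.21,
$$\vdash (\forall i)_{<\mathit{lh}(x)}\,(x*y)_i = (x)_i \;\land\; (\forall i)_{<\mathit{lh}(y)}\,(x*y)_{i+\mathit{lh}(x)} = (y)_i \qquad (6)$$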


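To prove (5), add $i < \mathit{lh}(x * y)$. By (3) and (6) we have two cases.
Case 1. $i < \mathit{lh}(x)$. By (1) and (6) (first conjunct) we obtain (using Ax4)
$$\begin{aligned}
\vdash{} & \mathit{LA}((x*y)_i) \lor \Gamma((x*y)_i) \lor {} \\
& (\exists j)_{<i}(\exists k)_{<i}\mathit{MP}((x*y)_j,(x*y)_k,(x*y)_i) \lor {} \\
& (\exists j)_{<i}\mathit{EI}((x*y)_j,(x*y)_i) \lor {} \\
& (\exists j)_{<i}\mathit{SR}((x*y)_j,(x*y)_i)
\end{aligned} \qquad (7)$$

Case 2. $i \ge \mathit{lh}(x)$. Setting $m = \delta(i, \mathit{lh}(x))$, we get $\vdash m + \mathit{lh}(x) = i$ (cf. II.1.27) and $\vdash m < \mathit{lh}(y)$ (using (3)). By (2),
$$\begin{aligned}
\vdash{} & \mathit{LA}((y)_m) \lor \Gamma((y)_m) \lor {} \\
& (\exists j)_{<m}(\exists k)_{<m}\mathit{MP}((y)_j,(y)_k,(y)_m) \lor {} \\
& (\exists j)_{<m}\mathit{EI}((y)_j,(y)_m) \lor {} \\
& (\exists j)_{<m}\mathit{SR}((y)_j,(y)_m)
\end{aligned}$$

Hence, bringing in (6) (second conjunct) via Ax4 and noting that $j < m \to j < \mathit{lh}(y)$, we get
$$\begin{aligned}
\vdash{} & \mathit{LA}((x*y)_{m+\mathit{lh}(x)}) \lor \Gamma((x*y)_{m+\mathit{lh}(x)}) \lor {} \\
& (\exists j)_{<m}(\exists k)_{<m}\mathit{MP}((x*y)_{j+\mathit{lh}(x)},(x*y)_{k+\mathit{lh}(x)},(x*y)_{m+\mathit{lh}(x)}) \lor {} \\
& (\exists j)_{<m}\mathit{EI}((x*y)_{j+\mathit{lh}(x)},(x*y)_{m+\mathit{lh}(x)}) \lor {} \\
& (\exists j)_{<m}\mathit{SR}((x*y)_{j+\mathit{lh}(x)},(x*y)_{m+\mathit{lh}(x)})
\end{aligned}$$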


Translating the above (via proof by cases) in terms of $i$, we obtain (7) once more.
To sample one case, we translate the last disjunct. We claim that
$$\vdash (\exists j)_{<m}\mathit{SR}((x*y)_{j+\mathit{lh}(x)},(x*y)_{m+\mathit{lh}(x)}) \to (\exists j)_{<i}\mathit{SR}((x*y)_j,(x*y)_i)$$

Assume the hypothesis, and add a new constant $a$ and the assumption
$$a < m \land \mathit{SR}((x*y)_{a+\mathit{lh}(x)},(x*y)_{m+\mathit{lh}(x)})$$

By II.1.17, $\vdash a + \mathit{lh}(x) < m + \mathit{lh}(x)$, that is, $\vdash a + \mathit{lh}(x) < i$. The above now yields
$$a + \mathit{lh}(x) < i \land \mathit{SR}((x*y)_{a+\mathit{lh}(x)},(x*y)_i)$$
Hence (Ax2)
$$\vdash (\exists j)(j < i \land \mathit{SR}((x*y)_j,(x*y)_i))$$


After all this, the deduction theorem establishes (5). 
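
The content of Lemma II.6.13 can also be checked informally, outside the formal theory. The following Python sketch is only a toy model and not the arithmetized predicate Proof of the text: formulas are plain Python values, modus ponens is the only rule modelled, and the names `imp`, `is_proof` and `AXIOMS` are hypothetical. It illustrates why concatenation preserves proofhood: each line of the second proof still finds its premises at earlier positions, now shifted by the length of the first proof, exactly as in (6).

```python
from typing import Hashable, Sequence, Set, Tuple

Formula = Hashable  # toy formulas: strings, or ("->", A, B) tuples


def imp(a: Formula, b: Formula) -> Tuple:
    """Toy stand-in for the formula 'a -> b'."""
    return ("->", a, b)


def is_proof(seq: Sequence[Formula], axioms: Set[Formula]) -> bool:
    """True if seq is nonempty and every entry is an axiom or follows
    by modus ponens from two entries at strictly smaller indices."""
    if len(seq) == 0:
        return False
    for i, f in enumerate(seq):
        if f in axioms:
            continue
        earlier = seq[:i]
        if any(imp(h, f) in earlier for h in earlier):
            continue
        return False
    return True


# Hypothetical axiom set and two short proofs over it.
AXIOMS = {"A", imp("A", "B"), "C", imp("C", "D")}
x = ["A", imp("A", "B"), "B"]
y = ["C", imp("C", "D"), "D"]

xy = x + y  # list concatenation plays the role of x * y
```

Here `is_proof(xy, AXIOMS)` holds because `xy[i + len(x)] == y[i]` for every `i < len(y)`, which is the toy counterpart of the second conjunct of (6).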


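II.6.14 Lemma.
$$\vdash \mathit{Proof}(w) \land (\exists j)_{<\mathit{lh}(w)}(\exists k)_{<\mathit{lh}(w)}\mathit{MP}((w)_j,(w)_k,x) \to \mathit{Proof}(w * \langle x\rangle)$$

Proof. Assume
$$\mathit{Proof}(w) \qquad (8)$$
and
$$(\exists j)_{<\mathit{lh}(w)}(\exists k)_{<\mathit{lh}(w)}\mathit{MP}((w)_j,(w)_k,x) \qquad (9)$$
and set $t = w * \langle x\rangle$. Since $\vdash \mathit{Seq}(t) \land \mathit{lh}(t) = S(\mathit{lh}(w))$, the first two required conjuncts for $\mathit{Proof}(w * \langle x\rangle)$ are settled. To conclude, we add $i < \mathit{lh}(t)$, that is (<2), $i < \mathit{lh}(w) \lor i = \mathit{lh}(w)$, and prove
$$\begin{aligned}
\vdash{} & \mathit{LA}((t)_i) \lor \Gamma((t)_i) \lor {} \\
& (\exists j)_{<i}(\exists k)_{<i}\mathit{MP}((t)_j,(t)_k,(t)_i) \lor {} \\
& (\exists j)_{<i}\mathit{EI}((t)_j,(t)_i) \lor {} \\
& (\exists j)_{<i}\mathit{SR}((t)_j,(t)_i)
\end{aligned} \qquad (10)$$

There are two cases:

Case 1. $i < \mathit{lh}(w)$. Thus (cf. (6) (first conjunct) of the previous proof)
$$\vdash (t)_i = (w)_i \qquad (11)$$
By (8),
$$\begin{aligned}
\vdash{} & \mathit{LA}((w)_i) \lor \Gamma((w)_i) \lor {} \\
& (\exists j)_{<i}(\exists k)_{<i}\mathit{MP}((w)_j,(w)_k,(w)_i) \lor {} \\
& (\exists j)_{<i}\mathit{EI}((w)_j,(w)_i) \lor {} \\
& (\exists j)_{<i}\mathit{SR}((w)_j,(w)_i)
\end{aligned}$$
Now (10) follows from the above and (11) via Ax4.

Case 2. $i = \mathit{lh}(w)$. Thus, $\vdash (t)_i = x$. By (9),
$$\vdash (\exists j)_{<\mathit{lh}(w)}(\exists k)_{<\mathit{lh}(w)}\mathit{MP}((w)_j,(w)_k,(t)_i)$$
or, using (11),
$$\vdash (\exists j)_{<i}(\exists k)_{<i}\mathit{MP}((t)_j,(t)_k,(t)_i)$$
(10) now follows by tautological implication.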


II.6.15 Lemma.
$$\vdash \mathit{Proof}(w) \land (\exists j)_{<\mathit{lh}(w)}\mathit{SR}((w)_j,x) \to \mathit{Proof}(w * \langle x\rangle)$$

Proof. A trivial rephrasing of the previous proof.

II.6.16 Exercise. Show by a formal course-of-values induction on $i$ that
$$\vdash i < \mathit{lh}(x) \land \mathit{Proof}(x) \to \mathit{WFF}((x)_i) = 0$$
Moreover, show that this is so regardless of whether one adopts $\mathit{MP}$, $\mathit{EI}$, and $\mathit{SR}$ as we did on p. 270 or, instead, one adopts their primed versions (cf. II.5.1, p. 271).


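II.6.17 Corollary.
$$\vdash \mathit{Deriv}(y,u) \land \mathit{Deriv}(z, \langle\ulcorner\to\urcorner,u,x\rangle) \to \mathit{Deriv}(y * z * \langle x\rangle, x)$$

Note. We have used the abbreviation "$\langle\ulcorner\to\urcorner,u,x\rangle$" for "$\langle\ulcorner\lor\urcorner, \langle\ulcorner\lnot\urcorner,u\rangle, x\rangle$".

Proof. By Lemma II.6.13, $\vdash \mathit{Proof}(y * z)$. The result follows at once from II.6.14 since $\mathit{MP}(u, \langle\ulcorner\to\urcorner,u,x\rangle, x)$.

This last claim also uses II.6.16, since the hypothesis (to the left of "$\to$") yields $\vdash \mathit{WFF}(u) = 0$ and $\vdash \mathit{WFF}(\langle\ulcorner\to\urcorner,u,x\rangle) = 0$.

A special case of the above is: For any formulas $\mathcal{A}$ and $\mathcal{B}$,
$$\vdash \mathit{Deriv}(y,\ulcorner\mathcal{A}\urcorner) \land \mathit{Deriv}(z,\ulcorner\mathcal{A}\to\mathcal{B}\urcorner) \to \mathit{Deriv}(y * z * \langle\ulcorner\mathcal{B}\urcorner\rangle, \ulcorner\mathcal{B}\urcorner)$$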
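II.6.18 Corollary.
$$\vdash \mathit{Term}(z) = 0 \land \mathit{Sub}(x,i,z) > 0 \to \mathit{Deriv}(y,x) \to \mathit{Deriv}(y * \langle\mathit{Sub}(x,i,z)\rangle, \mathit{Sub}(x,i,z))$$

Proof. We add $\mathit{Term}(z) = 0$, $\mathit{Sub}(x,i,z) > 0$, and $\mathit{Deriv}(y,x)$. That
$$\vdash \mathit{Deriv}(y * \langle\mathit{Sub}(x,i,z)\rangle, \mathit{Sub}(x,i,z)) \qquad (12)$$
will then follow at once from II.6.15 once we prove
$$\vdash \mathit{SR}(x, \mathit{Sub}(x,i,z)) \qquad (13)$$

We look first at the interesting case: That is, we add $\mathit{Free}(i,x) = 0$.
To prove (13) we need (see definition of $\mathit{SR}$, p. 270) to prove
$$\begin{aligned}
\vdash{} & \mathit{WFF}(x) = 0 \land \mathit{WFF}(\mathit{Sub}(x,i,z)) = 0 \land {} \\
& (\exists i)_{<x}(\exists w)_{\le \mathit{Sub}(x,i,z)}\bigl(\mathit{Term}(w) = 0 \land \mathit{Sub}(x,i,z) = \mathit{Sub}(x,i,w)\bigr)
\end{aligned} \qquad (14)$$
or, simply,
$$\vdash \mathit{WFF}(x) = 0 \land \mathit{WFF}(\mathit{Sub}(x,i,z)) = 0 \land i < x \land z \le \mathit{Sub}(x,i,z)$$
because then the third conjunct of (14) follows by MP and Ax2 from
$$\vdash i < x \land z \le \mathit{Sub}(x,i,z) \land \mathit{Term}(z) = 0 \land \mathit{Sub}(x,i,z) = \mathit{Sub}(x,i,z)$$

That is, we need
$$\vdash \mathit{WFF}(x) = 0 \qquad \text{(i)}$$
$$\vdash \mathit{WFF}(\mathit{Sub}(x,i,z)) = 0 \qquad \text{(ii)}$$
$$\vdash i < x \qquad \text{(iii)}$$
$$\vdash z \le \mathit{Sub}(x,i,z) \qquad \text{(iv)}$$
We get (i) by $\vdash \mathit{Deriv}(y,x)$ (II.6.16), while (ii) follows from II.6.11, (i), and the assumptions. We get (iii) from the definition of $\mathit{Free}$. Finally, (iv) is by II.6.10.

The other case is to add $\mathit{Free}(i,x) > 0$. Then $\vdash \mathit{Sub}(x,i,z) = x$ (see Exercise II.19); hence we need only show that
$$\vdash \mathit{Deriv}(y,x) \to \mathit{Deriv}(y * \langle x\rangle, x)$$
We leave this as an easy exercise (Exercise II.20).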


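We can also state a special case: For any formula $\mathcal{A}$, variable $v_i$ and term $t$ substitutable for $v_i$ in $\mathcal{A}$,
$$\vdash \mathit{Deriv}(y,\ulcorner\mathcal{A}\urcorner) \to \mathit{Deriv}(y * \langle\mathit{sub}_i(\ulcorner\mathcal{A}\urcorner,\ulcorner t\urcorner)\rangle, \mathit{sub}_i(\ulcorner\mathcal{A}\urcorner,\ulcorner t\urcorner))$$

Translating the last two corollaries in terms of $\Theta$ we obtain

II.6.19 Corollary.
$$\vdash \Theta(u) \land \Theta(\langle\ulcorner\to\urcorner,u,x\rangle) \to \Theta(x)$$

II.6.20 Corollary.
$$\vdash \mathit{Term}(z) = 0 \land \mathit{Sub}(x,i,z) > 0 \to \Theta(x) \to \Theta(\mathit{Sub}(x,i,z))$$

II.6.21 Remark. (1) A special case of Corollary II.6.19 is worth stating: For any formulas $\mathcal{A}$ and $\mathcal{B}$ over $L_N$,
$$\vdash \Theta(\ulcorner\mathcal{A}\urcorner) \land \Theta(\ulcorner\mathcal{A}\to\mathcal{B}\urcorner) \to \Theta(\ulcorner\mathcal{B}\urcorner)$$

(2) We omit the rather trivial ("trivial" given the tools we already have developed, that is) proofs of II.6.19-II.6.20. Suffice it to observe that, for example, using Ax2 and $\vdash_{taut}$, we get from II.6.17
$$\vdash \mathit{Deriv}(y,u) \land \mathit{Deriv}(z,\langle\ulcorner\to\urcorner,u,x\rangle) \to \Theta(x)$$
Using $\exists$-introduction, we can now get rid of $y$ and $z$.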


In order to focus the mind, we now state the derivability conditions that hold for $\Theta$ (Ableitbarkeitsforderungen of Hilbert and Bernays (1968)). We will need to establish that they indeed do hold in order to meet our goal of this section.

II.6.22 Definition (Derivability Conditions). The following statements are the derivability conditions for the derivability predicate $\Theta$ appropriate for $\Gamma$. For any formulas $\mathcal{A}$ and $\mathcal{B}$ over $L_N$,†

DC 1. If $\vdash_\Gamma \mathcal{A}$, then $\vdash_C \Theta(\ulcorner\mathcal{A}\urcorner)$.

DC 2. $\vdash_C \Theta(\ulcorner\mathcal{A}\urcorner) \land \Theta(\ulcorner\mathcal{A}\to\mathcal{B}\urcorner) \to \Theta(\ulcorner\mathcal{B}\urcorner)$.

DC 3. For any primitive recursive term $t$ over $L_{NC}$ and any numbers $a_1,\ldots,a_n$ in $\mathbb{N}$, $\vdash_C t(\widetilde{a}_1,\ldots,\widetilde{a}_n) = 0 \to \Theta(\ulcorner t(\widetilde{a}_1,\ldots,\widetilde{a}_n) = 0\urcorner)$.

II.6.23 Remark. DC 3 above is pretty much the Hilbert-Bernays (1968) third derivability condition. DC 1-DC 2 are actually Löb's versions. Löb also has a different third derivability condition, a formalized version of DC 1, that we prove later (see II.6.38).


Now, DC 1 was settled in II.6.3(4), while DC 2 is II.6.21(1). Thus we focus our effort on proving DC 3. To this end, we will find it convenient to prove a bit more: namely, that DC 3 is true even if instead of the $\widetilde{a}_1,\ldots,\widetilde{a}_n$ we use free variables $x_1,\ldots,x_n$. To formulate such a version we will employ, for any term or formula $E(x_1,\ldots,x_n)$,‡ a primitive recursive term $g_E(x_1,\ldots,x_n)$ such that, for all $a_1,\ldots,a_n$ in $\mathbb{N}$,
$$\vdash_C \ulcorner E(\widetilde{a}_1,\ldots,\widetilde{a}_n)\urcorner = g_E(\widetilde{a}_1,\ldots,\widetilde{a}_n) \qquad (1)$$

Assume for a moment that we have obtained such a family of $g$-terms, and let $g_{t=0}(x_1,\ldots,x_n)$ be appropriate for the formula $t(x_1,\ldots,x_n) = 0$. If now we manage to prove
$$\vdash t(x_1,\ldots,x_n) = 0 \to \Theta(g_{t=0}(x_1,\ldots,x_n))$$
then DC 3 follows by substitution and Ax4 from (1) above.
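
Spelled out, the passage from the free-variable statement to DC 3 is the following short chain. This is only a sketch in the notation of this section, where $\widetilde{a}_i$ denotes the numeral of $a_i$ and $g_{t=0}$ is the assumed $g$-term for the formula $t = 0$.

```latex
\begin{align*}
&\vdash_C t(x_1,\ldots,x_n) = 0 \to \Theta(g_{t=0}(x_1,\ldots,x_n))
  && \text{(assumed proved)} \\
&\vdash_C t(\widetilde{a}_1,\ldots,\widetilde{a}_n) = 0
   \to \Theta(g_{t=0}(\widetilde{a}_1,\ldots,\widetilde{a}_n))
  && \text{(substitution } x_i \leftarrow \widetilde{a}_i\text{)} \\
&\vdash_C g_{t=0}(\widetilde{a}_1,\ldots,\widetilde{a}_n)
   = \ulcorner t(\widetilde{a}_1,\ldots,\widetilde{a}_n) = 0 \urcorner
  && \text{(by (1))} \\
&\vdash_C t(\widetilde{a}_1,\ldots,\widetilde{a}_n) = 0
   \to \Theta(\ulcorner t(\widetilde{a}_1,\ldots,\widetilde{a}_n) = 0 \urcorner)
  && \text{(Ax4)}
\end{align*}
```

The last line is DC 3 as stated in II.6.22.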
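To this end, we first address an important special case: We introduce a function symbol $\mathit{Num}$ such that
$$\vdash \mathit{Num}(\widetilde{n}) = \ulcorner\widetilde{n}\urcorner \quad \text{for all } n \in \mathbb{N} \qquad (2)$$
Thus, we are saying that $\mathit{Num} = g_x$, that is, a $g$-term appropriate for the term $x$.

The symbol $\mathit{Num}$ and its defining axioms below are in $L_{NC}$ and $C$ respectively, since the introducing axioms make it clear that $\mathit{Num}$ is primitive recursive.

† In the interest of clarity (and emphasis) we have retained the subscript $C$ of $\vdash$ wherever it was applicable throughout the definition.
‡ Recall the convention on round brackets, p. 18.

We define
$$\begin{aligned}
\mathit{Num}(0) &= \ulcorner 0\urcorner \\
\mathit{Num}(Sx) &= \langle\ulcorner S\urcorner, \mathit{Num}(x)\rangle
\end{aligned} \qquad (\mathit{Num})$$

To see that the above definition behaves as required by (2), we do metamathematical induction on $n \in \mathbb{N}$: For $n = 0$, the claim is settled by the first equation in the group ($\mathit{Num}$) and by $\vdash \widetilde{0} = 0$ (definition of "$\widetilde{n}$", p. 171).

Assume the claim for a fixed $n$. Now (definition), $\vdash \widetilde{n+1} = S\widetilde{n}$; thus (Ax4 and second equation in ($\mathit{Num}$))
$$\begin{aligned}
&\vdash \mathit{Num}(\widetilde{n+1}) = \langle\ulcorner S\urcorner, \mathit{Num}(\widetilde{n})\rangle \\
&\vdash \langle\ulcorner S\urcorner, \mathit{Num}(\widetilde{n})\rangle = \langle\ulcorner S\urcorner, \ulcorner\widetilde{n}\urcorner\rangle && \text{by I.H. and Ax4} \\
&\vdash \langle\ulcorner S\urcorner, \ulcorner\widetilde{n}\urcorner\rangle = \ulcorner S\widetilde{n}\urcorner && \text{by definition of "}\ulcorner St\urcorner\text{"} \\
&\vdash \ulcorner S\widetilde{n}\urcorner = \ulcorner\widetilde{n+1}\urcorner && \text{by definition of } \widetilde{n}
\end{aligned}$$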
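
The induction just given can be mirrored by a small computation. The Python sketch below is a toy model under loudly stated assumptions: the Cantor pairing function `pair` stands in for the sequence coding $\langle a, b\rangle$, and the symbol codes `CODE_ZERO` and `CODE_S` are arbitrary hypothetical values, not the Gödel numbers used in the text. It defines `num` by the recursion (Num) on the number and `gn` by recursion on the syntax of the numeral, and the two agree on every numeral, which is the content of (2).

```python
def pair(a: int, b: int) -> int:
    """Cantor pairing; a hypothetical stand-in for the coding <a, b>."""
    return (a + b) * (a + b + 1) // 2 + b


CODE_ZERO = 1  # assumed code of the symbol 0
CODE_S = 2     # assumed code of the symbol S


def numeral(n: int):
    """Syntax tree of the numeral S...S0 with n copies of S."""
    term = "0"
    for _ in range(n):
        term = ("S", term)
    return term


def gn(term) -> int:
    """Toy Goedel number, computed by recursion on the syntax tree."""
    if term == "0":
        return CODE_ZERO
    symbol, arg = term
    assert symbol == "S"
    return pair(CODE_S, gn(arg))


def num(n: int) -> int:
    """Num(0) = gn(0); Num(Sx) = <gn(S), Num(x)>, following (Num)."""
    return CODE_ZERO if n == 0 else pair(CODE_S, num(n - 1))
```

With these assumed codes, `num(1)` is `pair(2, 1) = 7` and `num(2)` is `pair(2, 7) = 52`; in each case `num(n) == gn(numeral(n))`.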
Intuitively, $\mathit{Num}(x)$ is the Gödel number of
$$\underbrace{S \ldots S}_{x \text{ copies}}\,0 \qquad (3)$$
where $x$ is a formal variable (not just a closed term $\widetilde{n}$). If we are right in this assessment, then it must be the case that $\vdash \mathit{Term}(\mathit{Num}(x)) = 0$. Moreover, even though the expression in (3),† intuitively, "depends" on $x$, it still contains no variables (it is a sequence of copies of the symbol "$S$" followed by "$0$"), so that we expect $\mathit{Free}(y, \mathit{Num}(x)) > 0$.

Both expectations are well founded.
II.6.24 Lemma. $\vdash \mathit{Term}(\mathit{Num}(x)) = 0$.

Proof. Formal induction on $x$. For $x = 0$, we want $\vdash \mathit{Term}(\ulcorner 0\urcorner) = 0$. This is the case, by II.3.8, since $\mathit{Term}(gn(0)) = 0$ is true.

We now prove
$$\vdash \mathit{Term}(\mathit{Num}(Sx)) = 0 \qquad (4)$$
based on the obvious I.H. It suffices (by ($\mathit{Num}$) above) to prove
$$\vdash \mathit{Term}(\langle\ulcorner S\urcorner, \mathit{Num}(x)\rangle) = 0$$

† Of course, such an expression is not well formed in our formalism, because it is a variable length term. This is why we emphasized the word "intuitively" twice in this comment.

Turning to the definition of $\mathit{Term}$ (p. 267), we find that the third disjunct is provable, since
$$\begin{aligned}
\vdash{} & \mathit{Seq}(\langle\ulcorner S\urcorner, \mathit{Num}(x)\rangle) \land \mathit{lh}(\langle\ulcorner S\urcorner, \mathit{Num}(x)\rangle) = SS0 \\
& \land \mathit{func}((\langle\ulcorner S\urcorner, \mathit{Num}(x)\rangle)_0, S0) \\
& \land \mathit{Term}((\langle\ulcorner S\urcorner, \mathit{Num}(x)\rangle)_{S0}) = 0 && \text{the last conjunct by I.H.}
\end{aligned}$$

We are done, by definition by cases ((15), p. 221).

II.6.24 says that $\mathit{Num}(x)$ "behaves" like the Gödel number of a term. It does not say that there is a term $t$ such that $\mathit{Num}(x) = \ulcorner t\urcorner$. (See also the previous footnote, and also recall that Gödel numbers of specific expressions are closed terms.)


II.6.25 Lemma. $\vdash \mathit{Free}(y, \mathit{Num}(x)) > 0$.

Proof. We do (formal) induction on $x$. For $x = 0$ we want $\vdash \mathit{Free}(y,\ulcorner 0\urcorner) > 0$, which is correct (if we examine the cases in the definition of $\mathit{Free}$, p. 267, only the "otherwise" applies).

To conclude, we examine $\mathit{Free}(y, \mathit{Num}(Sx))$ (with frozen free variables $x$ and $y$). By II.6.24,
$$\vdash \mathit{Term}(\mathit{Num}(Sx)) = 0$$
so we need only examine (see (Fr), p. 267)
$$(\exists z)_{<SS0}\bigl(0 < z \land \mathit{Free}(y, (\langle\ulcorner S\urcorner, \mathit{Num}(x)\rangle)_z) = 0\bigr) \qquad (5)$$

We want to refute (5), so we add it as an assumption (with frozen free variables $x$ and $y$; proof by contradiction, I.4.21).
We take a new constant, $a$, and assume
$$a < SS0 \land 0 < a \land \mathit{Free}(y, (\langle\ulcorner S\urcorner, \mathit{Num}(x)\rangle)_a) = 0$$
One now gets $\vdash a = S0$ from the first two conjuncts;‡ hence
$$\vdash \mathit{Free}(y, \mathit{Num}(x)) = 0$$
from the last conjunct.

We have just contradicted the I.H. $\mathit{Free}(y, \mathit{Num}(x)) > 0$. Thus, the negation of (5) is provable. It now follows that the "otherwise" case (conjunction of the negations of the explicit cases) is provable in the definition of $\mathit{Free}(y, \mathit{Num}(Sx))$. Therefore, $\vdash \mathit{Free}(y, \mathit{Num}(Sx)) = S0$ (see also (15), p. 221).

‡ The first yields $a \le S0$, and the second yields $S0 \le a$.


It is now clear what the $g_E$-terms must be:

Let $E(v_{i_1},\ldots,v_{i_n})$ be a term or formula where the $v_{i_j}$ are (names for) the formal object variables, as these were constructed on p. 166 from the symbols of a finite alphabet. Let us introduce the informal abbreviation
$$s_i(x,y) = \mathit{sub}_i(x, \mathit{Num}(y)) \qquad (s)$$

Informal, because we do not need to introduce a formal symbol. All we need is convenience. Were we to introduce $s_i$ formally, it would then be a primitive recursive function symbol of $L_{NC}$, and (s) would be the defining axiom (essentially, composition). As it stands now, it is just a metatheoretical symbol.

Then $g_E(x_1,\ldots,x_n)$, where the $x_j$ are arbitrary metavariables, possibly overlapping with the $v_{i_j}$, is an informal abbreviation for the term†
$$s_{i_n}(\ldots s_{i_2}(s_{i_1}(\ulcorner E\urcorner, x_1), x_2), \ldots, x_n) \qquad (6)$$
Indeed Ax4, (2) (p. 296), and (s) above imply, for any $n \in \mathbb{N}$,
$$\vdash_C s_i(x,\widetilde{n}) = \mathit{sub}_i(x,\ulcorner\widetilde{n}\urcorner)$$
Thus (p. 279)
$$\vdash_C s_i(\ulcorner E(v_i)\urcorner,\widetilde{n}) = \ulcorner E(\widetilde{n})\urcorner$$
Repeating the above for each free variable of $E$, we obtain
$$\vdash_C g_E(\widetilde{m}_1,\ldots,\widetilde{m}_n) = \ulcorner E(\widetilde{m}_1,\ldots,\widetilde{m}_n)\urcorner$$
It is clear that $g_E$ is a primitive recursive term (each $\mathit{sub}_i(x,y)$ is).
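
The behaviour of the $g$-terms on numerals can be imitated in a toy setting. In the Python sketch below, which rests on assumptions not made in the text, expressions are represented directly by syntax trees (nested tuples) instead of Gödel numbers, and the helper names `numeral`, `sub`, `s`, `g` and `simultaneous` are hypothetical. The function `g` follows (6): it substitutes numerals one variable at a time. The comparison with a simultaneous substitution illustrates the last displayed equation, and the independence of the order of substitution (closed terms contain no variables).

```python
def numeral(n: int):
    """Syntax tree of S...S0 with n copies of S."""
    term = "0"
    for _ in range(n):
        term = ("S", term)
    return term


def sub(expr, i: int, t):
    """Replace every occurrence of the variable ("v", i) in expr by t."""
    if expr == ("v", i):
        return t
    if isinstance(expr, tuple) and expr[0] != "v":
        return (expr[0],) + tuple(sub(e, i, t) for e in expr[1:])
    return expr  # the constant "0" or a different variable


def s(i: int, expr, y: int):
    """Toy version of (s): s_i(x, y) = sub_i(x, Num(y))."""
    return sub(expr, i, numeral(y))


def g(expr, indices, values):
    """Toy version of (6): substitute numerals one variable at a time."""
    for i, y in zip(indices, values):
        expr = s(i, expr, y)
    return expr


def simultaneous(expr, env):
    """Replace all variables at once, for comparison."""
    if isinstance(expr, tuple) and expr[0] == "v":
        return numeral(env[expr[1]])
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(simultaneous(e, env) for e in expr[1:])
    return expr


# E(v1, v3) is the formula  v1 + S v3 = v1  (hypothetical example).
E = ("=", ("+", ("v", 1), ("S", ("v", 3))), ("v", 1))
```

Both `g(E, [1, 3], [2, 0])` and `g(E, [3, 1], [0, 2])` return the tree of $E(\widetilde{2}, \widetilde{0})$, the same value as `simultaneous(E, {1: 2, 3: 0})`.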


We abbreviate (6), i.e., $g_E(x_1,\ldots,x_n)$, by yet another informal notation as in Hilbert and Bernays (1968):
$$\{E(x_1,\ldots,x_n)\} \qquad (7)$$

The reader will exercise care not to confuse the syntactic (meta)variables $x_j$ introduced in (6) and (7) with the formal variables that occur free in the expression $E$. The latter are the $v_{i_j}$, as already noted when introducing (6). In particular, the $i$ in $\mathit{sub}_i$ is that of the formal $v_i$, not that of the "syntactic" $x_j$.‡

† By II.6.25, the order of substitutions is irrelevant. "$E$" and "$E(v_{i_1},\ldots,v_{i_n})$" are, of course, interchangeable. The latter simply indicates explicitly the list of all free variables of $E$.

‡ Unless, totally coincidentally, $x_i$ denotes $v_i$ in some context. See the informal definition (s) above, and also the definition ($\mathit{sub}_i$), p. 268.

We may write simply $\{E\}$ if the variables $x_1,\ldots,x_n$ are implied by the context.

We note that while $\ulcorner E(x_1,\ldots,x_n)\urcorner$ is a closed term, $\{E(x_1,\ldots,x_n)\}$ is a term with precisely $x_1,\ldots,x_n$ as its list of free variables.


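II.6.26 Example. $\vdash \{x\} = \mathit{Num}(x)$. Indeed,
$$\vdash \{x\} = \mathit{sub}_m(\ulcorner v_m\urcorner, \mathit{Num}(x))$$
But $\vdash \mathit{sub}_m(\ulcorner v_m\urcorner, \mathit{Num}(x)) = \mathit{Num}(x)$ (by the definition of $\mathit{Sub}$, first case, p. 268). For any terms
$$t(v_{j_1},\ldots,v_{j_k},v_{m_1},\ldots,v_{m_r})$$
and
$$s(v_{j_1},\ldots,v_{j_k},v_{i_1},\ldots,v_{i_n})$$
and any choice of (meta)variables
$$x_1,\ldots,x_k,\; y_1,\ldots,y_n,\; z_1,\ldots,z_r \qquad (1)$$
we have
$$\vdash \{(t = s)(x_1,\ldots,x_k,y_1,\ldots,y_n,z_1,\ldots,z_r)\} = \langle\ulcorner =\urcorner, \{t(x_1,\ldots,x_k,z_1,\ldots,z_r)\}, \{s(x_1,\ldots,x_k,y_1,\ldots,y_n)\}\rangle \qquad (2)$$
Now that the list (1) has been recorded, we will feel free to abbreviate (2) as
$$\vdash \{t = s\} = \langle\ulcorner =\urcorner, \{t\}, \{s\}\rangle \qquad (3)$$
We proceed to establish (2) (or (3)) recalling that the order of substitution is irrelevant, and using "$=$" conjunctionally below (i.e., $\vdash t = s = r$ means $\vdash t = s$ and $\vdash s = r$):
$$\begin{aligned}
\vdash{} & \{(t = s)(x_1,\ldots,x_k,y_1,\ldots,y_n,z_1,\ldots,z_r)\} \\
={} & s_{m_r}(\ldots s_{i_n}(\ldots s_{j_k}(\ldots s_{j_1}(\ulcorner t = s\urcorner, x_1)\ldots, x_k)\ldots, y_n)\ldots, z_r) \\
={} & \langle\ulcorner =\urcorner,\; s_{m_r}(\ldots s_{i_n}(\ldots s_{j_k}(\ldots s_{j_1}(\ulcorner t\urcorner, x_1)\ldots, x_k)\ldots, y_n)\ldots, z_r), \\
& \phantom{\langle\ulcorner =\urcorner,\;} s_{m_r}(\ldots s_{i_n}(\ldots s_{j_k}(\ldots s_{j_1}(\ulcorner s\urcorner, x_1)\ldots, x_k)\ldots, y_n)\ldots, z_r)\rangle \\
={} & \langle\ulcorner =\urcorner, \{t(x_1,\ldots,x_k,z_1,\ldots,z_r)\}, \{s(x_1,\ldots,x_k,y_1,\ldots,y_n)\}\rangle
\end{aligned}$$
The computations above iterate case 4 of the definition of $\mathit{Sub}$ (subcase for $\mathit{AF}$, p. 268), keeping in mind informal definition (s) (p. 299). This iterative calculation relies on II.6.4 and II.6.24 to ensure that
$$\vdash \mathit{Term}(\mathit{Sub}(\ulcorner E\urcorner, i, \mathit{Num}(w))) = 0$$
at every step ($E$ is $t$ or $s$).

We have abbreviated
$\ulcorner (t = s)(v_{j_1},\ldots,v_{j_k},v_{m_1},\ldots,v_{m_r},v_{i_1},\ldots,v_{i_n})\urcorner$ by $\ulcorner t = s\urcorner$,
$\ulcorner t(v_{j_1},\ldots,v_{j_k},v_{m_1},\ldots,v_{m_r})\urcorner$ by $\ulcorner t\urcorner$,
and
$\ulcorner s(v_{j_1},\ldots,v_{j_k},v_{i_1},\ldots,v_{i_n})\urcorner$ by $\ulcorner s\urcorner$.

We have also freely used during our computation the fact that
$$\vdash \mathit{Free}(i,x) > 0 \land \mathit{Term}(x) = 0 \to \mathit{Sub}(x,i,z) = x$$
(see Exercise II.19).

The same type of computation, employing II.6.11, II.6.12, II.6.24, and II.6.25, establishes that
$$\vdash \{\mathcal{A}\to\mathcal{B}\} = \langle\ulcorner\to\urcorner, \{\mathcal{A}\}, \{\mathcal{B}\}\rangle$$
We have used the short $\{\ \}$-notation above.

The behaviour of the symbol $\{\ \}$ is not very stable with respect to substitution. For example, we have found that $\vdash \{x\} = \mathit{Num}(x)$ above. However, it is not true that for arbitrary unary $f$
$$\vdash \{fx\} = \mathit{Num}(fx) \qquad (4)$$
(see Exercise II.21). On the other hand, (4) does hold if $f = S$. Indeed (using, once again, "$=$" conjunctionally), we find that
$$\begin{aligned}
\vdash \{Sx\} &= \mathit{sub}_1(\ulcorner Sv_1\urcorner, \mathit{Num}(x)) \\
&= \langle\ulcorner S\urcorner, \mathit{sub}_1(\ulcorner v_1\urcorner, \mathit{Num}(x))\rangle \\
&= \langle\ulcorner S\urcorner, \mathit{Num}(x)\rangle \\
&= \mathit{Num}(Sx)
\end{aligned}$$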
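
To see where the computation changes for a unary function symbol $f$ other than $S$, one can repeat the first three steps verbatim. This is only a sketch in the notation above; that the resulting term is not provably equal to $\mathit{Num}(fx)$ is the content of Exercise II.21 and is not argued here.

```latex
\begin{align*}
\{fx\} &= \mathit{sub}_1(\ulcorner f v_1 \urcorner, \mathit{Num}(x)) \\
       &= \langle \ulcorner f \urcorner,
          \mathit{sub}_1(\ulcorner v_1 \urcorner, \mathit{Num}(x)) \rangle \\
       &= \langle \ulcorner f \urcorner, \mathit{Num}(x) \rangle
\end{align*}
```

For $S$, the final step $\langle\ulcorner S\urcorner, \mathit{Num}(x)\rangle = \mathit{Num}(Sx)$ is the second equation of ($\mathit{Num}$); the defining equations of $\mathit{Num}$ provide no analogous equation with $\ulcorner f\urcorner$ in place of $\ulcorner S\urcorner$.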


II.6.27 Lemma. $\vdash_C \Theta(x) \to \Theta(\mathit{Sub}(x,i,\mathit{Num}(y)))$.

Proof. By modus ponens from II.6.20, since
$$\vdash \mathit{Term}(\mathit{Num}(y)) = 0$$
by II.6.24, and
$$\vdash \Theta(x) \to \mathit{Sub}(x,i,\mathit{Num}(y)) > 0$$
the latter from $\vdash \Theta(x) \to \mathit{WFF}(x) = 0$ (cf. II.6.16), II.6.12, and II.6.25.

II.6.28 Corollary. For any formula $\mathcal{A}$ over $L_N$ and variable $v_i$,
$$\vdash_C \Theta(\ulcorner\mathcal{A}(v_i)\urcorner) \to \Theta(\mathit{Sub}(\ulcorner\mathcal{A}(v_i)\urcorner, i, \mathit{Num}(x)))$$

Iterating application of the above for all the free variables in $\mathcal{A}$, we obtain at once:

II.6.29 Corollary. For any formula $\mathcal{A}$ over $L_N$,
$$\vdash_C \Theta(\ulcorner\mathcal{A}(v_{i_1},\ldots,v_{i_n})\urcorner) \to \Theta(\{\mathcal{A}(x_1,\ldots,x_n)\})$$

Of course, we also obtain a useful special case, where $x_j$ denotes $v_{i_j}$, as follows: For any formula $\mathcal{A}$ over $L_N$,
$$\vdash_C \Theta(\ulcorner\mathcal{A}(x_1,\ldots,x_n)\urcorner) \to \Theta(\{\mathcal{A}(x_1,\ldots,x_n)\})$$

The next corollary follows at once from the above remark and DC 1.

II.6.30 Corollary (The Free Variables Version of DC 1). For any formula $\mathcal{A}$ over $L_N$, if $\vdash_\Gamma \mathcal{A}(x_1,\ldots,x_n)$, then $\vdash_C \Theta(\{\mathcal{A}(x_1,\ldots,x_n)\})$.


II.6.31 Lemma (The Free Variables Version of DC 2). For any formulas $\mathcal{A}$ and $\mathcal{B}$ over $L_N$,
$$\vdash_C \Theta(\{\mathcal{A}\to\mathcal{B}\}) \to \Theta(\{\mathcal{A}\}) \to \Theta(\{\mathcal{B}\})$$

Proof. $\{\mathcal{A}\to\mathcal{B}\}$ abbreviates
$$\langle\ulcorner\to\urcorner,\; \mathit{sub}_{i_n}(\ldots\mathit{sub}_{i_1}(\ulcorner\mathcal{A}\urcorner, \mathit{Num}(x_1)),\ldots,\mathit{Num}(x_n)),\; \mathit{sub}_{i_n}(\ldots\mathit{sub}_{i_1}(\ulcorner\mathcal{B}\urcorner, \mathit{Num}(x_1)),\ldots,\mathit{Num}(x_n))\rangle$$
while $\{\mathcal{A}\}$ and $\{\mathcal{B}\}$ abbreviate
$$\mathit{sub}_{i_n}(\ldots\mathit{sub}_{i_1}(\ulcorner\mathcal{A}\urcorner, \mathit{Num}(x_1)),\ldots,\mathit{Num}(x_n))$$
and
$$\mathit{sub}_{i_n}(\ldots\mathit{sub}_{i_1}(\ulcorner\mathcal{B}\urcorner, \mathit{Num}(x_1)),\ldots,\mathit{Num}(x_n))$$
respectively. We are done, by II.6.19.

Pause. Why is
$$\vdash \mathit{WFF}(\mathit{sub}_{i_n}(\ldots\mathit{sub}_{i_1}(\ulcorner\mathcal{B}\urcorner, \mathit{Num}(x_1)),\ldots,\mathit{Num}(x_n))) = 0?$$

II.6.32 Remark. In what follows, all the way to the end of the section, we assume, without loss of generality, that our unspecified consistent $\Sigma_1$ extension, $\Gamma$, of PA also extends $C$, the theory of Gödel coding.
Here is why generality is not lost in the case where we were unfortunate enough to start with a $\Gamma$ that did not satisfy the assumption:

(1) We start with PA $\le \Gamma$,† where $\Gamma$ is $\Sigma_1(L_{NC})$.

(2) We extend $\Gamma$ to $\Gamma'$ by adding the axioms for all the defined symbols that were added to PA in order to form $C$ and $L_{NC}$. This results in the language $L_{NC}$. So, not only do we have PA $\le C$ conservatively, but also $\Gamma \le \Gamma'$ conservatively.

(3) Since $\Gamma$ is consistent, the "conservatively" part above ensures that so is $\Gamma'$. Trivially, PA $\le \Gamma'$ and $C \le \Gamma'$.

(4) If $Q$ is a formula that semantically defines the nonlogical axioms of $C$, that is, $Q(\ulcorner\mathcal{A}\urcorner)$ means that $\mathcal{A}$ is a $C$-axiom, then $\Gamma'(x)$ can be taken to abbreviate
$$\Gamma(x) \lor Q(x) \qquad (*)$$

Given that $Q(x)$ is primitive recursive (Exercise II.23‡), the formula $(*)$ is $\Sigma_1(L_{NC})$ (Exercise II.24).

II.6.33 Lemma. For any terms $t$, $s$ and variable $z$, if $\vdash_C t = s$ and
$$\vdash_C t = z \to \Theta(\{t = z\})$$
then also
$$\vdash_C s = z \to \Theta(\{s = z\})$$
where we have written "$\Theta$" for "$\Theta_\Gamma$".

Proof. "$\vdash$" means "$\vdash_C$". By Ax4
$$\vdash t = z \to s = z \qquad (1)$$
and
$$\vdash s = z \to t = z \qquad (2)$$

† The relation "$\le$" that compares two theories was introduced on p. 46.

‡ $C$ contains the finitely many defining axioms we have introduced towards the Gödel coding. Moreover it contains the infinitely many axioms introducing the primitive recursive symbols. A defining predicate for this part can be introduced by a course-of-values recursion, as can be easily deduced from the specific allocation scheme chosen for primitive recursive function symbols. See Remark II.3.7, p. 254.

By DC 1 and DC 2 (free variables versions, II.6.30 and II.6.31) in that order, (1) yields
$$\vdash \Theta(\{t = z\}) \to \Theta(\{s = z\}) \qquad (3)$$
where we have used the blanket assumption "$C \le \Gamma$" (II.6.32) in the application of DC 1 (DC 1 assumes "$\vdash_\Gamma \ldots$"). Our unused assumption, along with (2) and (3), yields what we want by tautological implication.


We now have all the tools we need towards proving DC 3.

II.6.34 Main Lemma. For any primitive recursive term $t$ over $L_{NC}$ and any variable $z$ not free in $t$,
$$\vdash_C t = z \to \Theta(\{t = z\}) \qquad (1)$$
where we have written "$\Theta$" for "$\Theta_\Gamma$".

We continue the practice of omitting the subscript $C$ from $\vdash$ (a subscript will be used if other than $C$, or as a periodic reminder (or emphasis) if it is $C$). The implicit assumption "$C \le \Gamma$" (II.6.32) enables II.6.30 (DC 1, free variables version) exactly as in the case of II.6.33.

Here is how not to prove the lemma: "By Ax4 it suffices to prove
$$\vdash \Theta(\{t = t\})$$
This follows from $\vdash t = t$ and II.6.30."

Such an attempt does not heed the warning in Example II.6.26. Recall that $\vdash \{t = z\} = \langle\ulcorner =\urcorner, \{t\}, \mathit{Num}(z)\rangle$. However, we have already noted that it is not true in general that $\vdash \mathit{Num}(t) = \{t\}$. Thus, in general,
$$\nvdash \{t = z\}[z \leftarrow t] = \{t = t\}$$


Proof. The proof is by (metamathematical) induction on the formation of $t$ and follows the one given by Hilbert and Bernays (1968) (however, unlike them, we do not restrict primitive recursive terms to "normalized" forms).

We have three basis cases:
Case 1. $t = Zv_0$. Since $\vdash Zv_0 = 0$, it suffices, by II.6.33, to prove the following version of (1):
$$\vdash 0 = z \to \Theta(\{0 = z\})$$
By Ax4 it suffices to prove
$$\vdash \Theta(\{0 = 0\})$$
This follows from $\vdash 0 = 0$ and II.6.30.

Pause. Wait a minute! How does this differ from what I said above that we should not do?

Case 2. $t = U_i^n(v_0,\ldots,v_{n-1})$.
Again, by II.6.33 and $\vdash U_i^n(v_0,\ldots,v_{n-1}) = v_{i-1}$, (1) now becomes (where we have simplified the metanotation: $x$ rather than the "actual" $v_{i-1}$)
$$\vdash x = z \to \Theta(\{x = z\})$$
By Ax4 it suffices to prove
$$\vdash \Theta(\{x = x\})$$
This follows from $\vdash x = x$ and II.6.30.

Pause. Was this O.K.?

Case 3. $t = Sx$. (1) now is
$$\vdash Sx = z \to \Theta(\langle\ulcorner =\urcorner, \{Sx\}, \mathit{Num}(z)\rangle)$$
By Ax4 it suffices to prove
$$\vdash \Theta(\langle\ulcorner =\urcorner, \{Sx\}, \mathit{Num}(Sx)\rangle)$$
that is (II.6.26), $\vdash \Theta(\langle\ulcorner =\urcorner, \{Sx\}, \{Sx\}\rangle)$, or (II.6.26 again)
$$\vdash \Theta(\{Sx = Sx\})$$
This follows from $\vdash Sx = Sx$ and II.6.30.


We now embark on the two induction steps:

Composition. Suppose that $t = fs_1\ldots s_n$, where the function $f$ and the terms $s_i$ are primitive recursive.
Let (I.H.)
$$\vdash fx_1\ldots x_n = z \to \Theta(\{fx_1\ldots x_n = z\}) \qquad (2)$$
and
$$\vdash s_i = x_i \to \Theta(\{s_i = x_i\}) \quad \text{for } i = 1,\ldots,n \qquad (3)$$
where none of the $x_i$ are free in any of the $s_j$, and, moreover, $z$ is not one of the $x_i$. By Ax4
$$\vdash s_1 = x_1 \to s_2 = x_2 \to \cdots \to s_n = x_n \to fs_1\ldots s_n = z \to fx_1\ldots x_n = z \qquad (4)$$
and
$$\vdash s_1 = x_1 \to s_2 = x_2 \to \cdots \to s_n = x_n \to fx_1\ldots x_n = z \to fs_1\ldots s_n = z \qquad (5)$$
By (2) and (4) (and tautological implication),
$$\vdash s_1 = x_1 \to s_2 = x_2 \to \cdots \to s_n = x_n \to fs_1\ldots s_n = z \to \Theta(\{fx_1\ldots x_n = z\}) \qquad (6)$$
By II.6.30 and (5),
$$\vdash \Theta(\{s_1 = x_1 \to s_2 = x_2 \to \cdots \to s_n = x_n \to fx_1\ldots x_n = z \to fs_1\ldots s_n = z\})$$
Hence, by II.6.31,
$$\vdash \Theta(\{s_1 = x_1\}) \to \Theta(\{s_2 = x_2\}) \to \cdots \to \Theta(\{s_n = x_n\}) \to \Theta(\{fx_1\ldots x_n = z\}) \to \Theta(\{fs_1\ldots s_n = z\})$$
The above and (3) tautologically imply
$$\vdash s_1 = x_1 \to s_2 = x_2 \to \cdots \to s_n = x_n \to \Theta(\{fx_1\ldots x_n = z\}) \to \Theta(\{fs_1\ldots s_n = z\})$$
The above and (6) tautologically imply
$$\vdash s_1 = x_1 \to s_2 = x_2 \to \cdots \to s_n = x_n \to fs_1\ldots s_n = z \to \Theta(\{fs_1\ldots s_n = z\}) \qquad (7)$$

Finally, since the $x_i$ are not free in any of the $s_j$ and are distinct from $z$, the substitutions $[x_i \leftarrow s_i]$ into (7), and tautological implication, imply
$$\vdash fs_1\ldots s_n = z \to \Theta(\{fs_1\ldots s_n = z\})$$


Primitive recursion. We are given that $h(\vec{y})$ and $g(x,\vec{y},w)$ are primitive recursive terms, and that $f$ has been introduced to satisfy
$$\begin{aligned}
&\vdash f(0,\vec{y}) = h(\vec{y}) \\
&\vdash f(Sx,\vec{y}) = g(x,\vec{y},f(x,\vec{y}))
\end{aligned} \qquad (8)$$

Supposing that $z$ is distinct from $x$, $\vec{y}$, we want to show
$$\vdash f(x,\vec{y}) = z \to \Theta(\{f(x,\vec{y}) = z\})$$

To allow the flexibility of splitting the induction step into an I.H. and a conclusion (as usual practice dictates, using the deduction theorem), we prove the following (provably) equivalent form instead:
$$\vdash (\forall z)\bigl(f(x,\vec{y}) = z \to \Theta(\{f(x,\vec{y}) = z\})\bigr) \qquad (9)$$
under the same restriction, that the bound variable $z$ is not among $x$, $\vec{y}$.
Now, our metamathematical I.H. (on the formation of primitive recursive terms $t$), under the assumption that $z$ is not among $x$, $\vec{y}$, $w$, is
$$\begin{aligned}
&\vdash h(\vec{y}) = z \to \Theta(\{h(\vec{y}) = z\}) \\
&\vdash g(x,\vec{y},w) = z \to \Theta(\{g(x,\vec{y},w) = z\})
\end{aligned} \qquad (10)$$

(9) is proved by formal induction on $x$.

For the formal basis of (9), let us set $x \leftarrow 0$. By the first of (8), the first of (10), and II.6.33 (followed by generalization) we deduce the basis, namely,
$$\vdash (\forall z)\bigl(f(0,\vec{y}) = z \to \Theta(\{f(0,\vec{y}) = z\})\bigr)$$
Now assume (9) (formal I.H.) for frozen $x$, $\vec{y}$, and prove
$$\vdash (\forall z)\bigl(f(Sx,\vec{y}) = z \to \Theta(\{f(Sx,\vec{y}) = z\})\bigr) \qquad (11)$$
By generalization, it suffices to prove
$$\vdash f(Sx,\vec{y}) = z \to \Theta(\{f(Sx,\vec{y}) = z\}) \qquad (11')$$
(where $z$ is not among $x$, $\vec{y}$). We choose $w$, not among $x$, $\vec{y}$, $z$. By Ax4,
$$\begin{aligned}
&\vdash f(x,\vec{y}) = w \to g(x,\vec{y},f(x,\vec{y})) = z \to g(x,\vec{y},w) = z \\
&\vdash f(x,\vec{y}) = w \to g(x,\vec{y},w) = z \to g(x,\vec{y},f(x,\vec{y})) = z
\end{aligned}$$
which by the second part of (8) and Ax4 translate into
$$\begin{aligned}
&\vdash f(x,\vec{y}) = w \to f(Sx,\vec{y}) = z \to g(x,\vec{y},w) = z \\
&\vdash f(x,\vec{y}) = w \to g(x,\vec{y},w) = z \to f(Sx,\vec{y}) = z
\end{aligned} \qquad (12)$$

Tautological implication using the second part of (10) and the first of (12) yields
$$\vdash f(x,\vec{y}) = w \to f(Sx,\vec{y}) = z \to \Theta(\{g(x,\vec{y},w) = z\}) \qquad (13)$$
II.6.30-II.6.31 and the second part of (12) yield
$$\vdash \Theta(\{f(x,\vec{y}) = w\}) \to \Theta(\{g(x,\vec{y},w) = z\}) \to \Theta(\{f(Sx,\vec{y}) = z\})$$
Using the I.H. (9), followed by specialization (and with the help of $\vdash_{taut}$), this yields
$$\vdash f(x,\vec{y}) = w \to \Theta(\{g(x,\vec{y},w) = z\}) \to \Theta(\{f(Sx,\vec{y}) = z\})$$

Here is where the (universally quantified) form of (9) helped formally. We have been manipulating the $z$-variable by substituting $w$. This would be enough to unfreeze variables frozen between I.H. and conclusion, thus invalidating the deduction theorem step. However, that is not the case here, because this variable manipulation is hidden from the deduction theorem. $z$ is a bound variable in the assumption (9).

Tautological implication, using the above and (13), furnishes
$$\vdash f(x,\vec{y}) = w \to f(Sx,\vec{y}) = z \to \Theta(\{f(Sx,\vec{y}) = z\})$$

Finally, since $x$, $w$, $z$, $\vec{y}$ are all distinct, the substitution $[w \leftarrow f(x,\vec{y})]$ in the above yields (11') by tautological implication, and we thus have (11) by generalization.


At this point the reader may take a well-deserved break. 
Next, we prove 


II.6.35 Corollary. For any primitive recursive predicate $P$ over $L_{NC}$,
$$\vdash_C P(x_1,\ldots,x_n) \to \Theta(\{P(x_1,\ldots,x_n)\})$$

Proof. Indeed,
$$\vdash P(x_1,\ldots,x_n) \to \chi_P(x_1,\ldots,x_n) = 0 \qquad (1)$$
and
$$\vdash \chi_P(x_1,\ldots,x_n) = 0 \to P(x_1,\ldots,x_n) \qquad (2)$$
where $\chi_P$ is "the" (primitive recursive) characteristic function (see II.3.6, p. 253, and II.1.25, p. 220) for $P(x_1,\ldots,x_n)$. By the main lemma (II.6.34),
$$\vdash \chi_P(x_1,\ldots,x_n) = 0 \to \Theta(\{\chi_P(x_1,\ldots,x_n) = 0\})$$
Hence (by (1))
$$\vdash P(x_1,\ldots,x_n) \to \Theta(\{\chi_P(x_1,\ldots,x_n) = 0\}) \qquad (3)$$
By (2) and II.6.30-II.6.31,
$$\vdash \Theta(\{\chi_P(x_1,\ldots,x_n) = 0\}) \to \Theta(\{P(x_1,\ldots,x_n)\})$$
This and (3) yield what we want, by tautological implication.


We can prove a bit more: 


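II.6.36 Corollary. For any $\Sigma_1(L_{NC})$ formula $\mathcal{A}$,
$$\vdash_C \mathcal{A}(x_1,\ldots,x_n) \to \Theta(\{\mathcal{A}(x_1,\ldots,x_n)\})$$
Proof. Let $\mathcal{A}(x_1,\ldots,x_n)$ be $\Sigma_1(L_{NC})$.

Then, for some $\Delta_0(L_{NC})$ formula $\mathcal{C}(y,x_1,\ldots,x_n)$, where $y$ is distinct from $x_1,\ldots,x_n$,
$$\vdash \mathcal{A}(x_1,\ldots,x_n) \leftrightarrow (\exists y)\mathcal{C}(y,x_1,\ldots,x_n)$$
We introduce a predicate $R$ by the definition
$$R(y,\vec{x}_n) \leftrightarrow \mathcal{C}(y,\vec{x}_n) \qquad (4)$$
Then $R$ is primitive recursive (Exercise II.22).

By II.6.35,
$$\vdash R(y,x_1,\ldots,x_n) \to \Theta(\{R(y,x_1,\ldots,x_n)\})$$
Hence, by (4) and tautological implication,
$$\vdash \mathcal{C}(y,x_1,\ldots,x_n) \to \Theta(\{R(y,x_1,\ldots,x_n)\}) \qquad (5)$$
By II.6.30-II.6.31 and the $\to$-part of (4),
$$\vdash \Theta(\{R(y,x_1,\ldots,x_n)\}) \to \Theta(\{\mathcal{C}(y,x_1,\ldots,x_n)\})$$
Hence (by (5))
$$\vdash \mathcal{C}(y,x_1,\ldots,x_n) \to \Theta(\{\mathcal{C}(y,x_1,\ldots,x_n)\}) \qquad (6)$$
Now,
$$\vdash \mathcal{C}(y,x_1,\ldots,x_n) \to (\exists y)\mathcal{C}(y,x_1,\ldots,x_n)$$
Hence, by II.6.30-II.6.31,
$$\vdash \Theta(\{\mathcal{C}(y,x_1,\ldots,x_n)\}) \to \Theta(\{(\exists y)\mathcal{C}(y,x_1,\ldots,x_n)\})$$
Combining the above with (6) via $\vdash_{taut}$, we get
$$\vdash \mathcal{C}(y,x_1,\ldots,x_n) \to \Theta(\{(\exists y)\mathcal{C}(y,x_1,\ldots,x_n)\})$$
which yields what we want via $\exists$-introduction.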


The following corollary is a step backward in terms of degree of generality, but it is all we really need. It is a formalized DC 1, for it says "if it is true that $\mathcal{A}$ is provable, then it is also true that $\Theta(\ulcorner\mathcal{A}\urcorner)$ is provable".

II.6.37 Corollary (Löb). For any formula $\mathcal{A}$ over $L_{NC}$,
$$\vdash_C \Theta(\ulcorner\mathcal{A}\urcorner) \to \Theta(\ulcorner\Theta(\ulcorner\mathcal{A}\urcorner)\urcorner)$$

Proof. For any fixed $\mathcal{A}$, $\Theta(\ulcorner\mathcal{A}\urcorner)$ is a $\Sigma_1(L_{NC})$ sentence.

II.6.38 Remark. The above corollary is Löb's third derivability condition, DC 3.


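II.6.39 Theorem (The Fixpoint Theorem, or Diagonalization Lemma). For any formula $\mathcal{A}(x)$ over $L_{NC}$ there is a sentence $\mathcal{B}$ over the basic language of PA, $L_N$, such that
$$\vdash_C \mathcal{B} \leftrightarrow \mathcal{A}(\ulcorner\mathcal{B}\urcorner)$$

$\mathcal{B}$ is called a fixed point or fixpoint of $\mathcal{A}$ in $C$.
The assumption $C \le \Gamma$ of II.6.32 is not used here.

Proof. Let $\mathcal{C}(x)$ be the formula obtained from $\mathcal{A}(s_1(x,x))$ after removing all defined symbols (following I.7.1 and I.7.3). Then $\mathcal{C}$ is over $L_N$, and
$$\vdash_C \mathcal{C}(x) \leftrightarrow \mathcal{A}(s_1(x,x)) \qquad (1)$$
Let next $n \in \mathbb{N}$ be such that (cf. (1), p. 274)
$$\vdash_C \widetilde{n} = \ulcorner\mathcal{C}(x)\urcorner$$
Hence (p. 279, 297, and 299)
$$\vdash_C s_1(\widetilde{n},\widetilde{n}) = \ulcorner\mathcal{C}(\widetilde{n})\urcorner \qquad (2)$$
By (1) and substitution,
$$\vdash_C \mathcal{C}(\widetilde{n}) \leftrightarrow \mathcal{A}(s_1(\widetilde{n},\widetilde{n}))$$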




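By Ax4 and (2), the above yields

$\vdash_C \mathcal{C}(\tilde{n}) \leftrightarrow \mathcal{A}(\ulcorner\mathcal{C}(\tilde{n})\urcorner)$

Thus “$\mathcal{C}(\tilde{n})$” is the sentence “$\mathcal{B}$” we want.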
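The diagonal construction can be imitated informally with strings. The sketch below is only an analogy, not the formal argument: Python strings stand in for formulas, `repr` for Gödel quotation, `eval` for truth, and every name in it (`PLACEHOLDER`, `diag`, `fixed_point`, `is_short`, `evaluate`) is hypothetical and chosen for this illustration.

```python
PLACEHOLDER = "@"  # hypothetical stand-in for the free variable x


def diag(template: str) -> str:
    """Analogue of s1(x, x): plug the quotation of `template` into itself."""
    return template.replace(PLACEHOLDER, repr(template))


def fixed_point(prop_name: str) -> str:
    """Build a sentence B that asserts property `prop_name` of B's own text."""
    c = f"{prop_name}(diag({PLACEHOLDER}))"  # C(x) := A(s1(x, x))
    return diag(c)                           # B := C(quotation of C)


def is_short(s: str) -> bool:
    """A sample property A of strings (any string predicate would do)."""
    return len(s) < 50


def evaluate(expr: str):
    """Evaluate a toy sentence or term; only the two names below are visible."""
    return eval(expr, {"diag": diag, "is_short": is_short})


b = fixed_point("is_short")
named = b[len("is_short("):-1]      # the term inside B that names a string
print(b)
print(evaluate(named) == b)         # the string that B talks about is B itself
print(evaluate(b) == is_short(b))   # B holds exactly when A holds of B's text
```

The design mirrors the proof: `c` plays the role of the formula built from the substitution function, and applying `diag` to `c` corresponds to substituting the numeral of its own Gödel number.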


Applying the above to $\neg\Theta_\Gamma$, we obtain a sentence that semantically says “I
am not a theorem of $\Gamma$”.

II.6.40 Corollary (Gödel). There is a sentence $\mathcal{G}$ over the basic language $L_N$
such that

$\vdash_C \mathcal{G} \leftrightarrow \neg\Theta_\Gamma(\ulcorner\mathcal{G}\urcorner)$  (1)

We now need to revisit (half of†) Gödel's first incompleteness theorem. We
will show that the sentence $\mathcal{G}$ above is not provable in $\Gamma$. For the balance of
the section we will be careful to subscript $\Theta$ with the name, say $\Gamma$, of the theory
for which it is a provability predicate.

II.6.41 Lemma (Gödel). Let $\Gamma$ over $L_N$ be a consistent extension of PA such
that $\Gamma(x)$ is $\Sigma_1(L_{NC})$. Let the sentence $\mathcal{G}$ over $L_N$ be a fixed point of $\neg\Theta_\Gamma$ in
$C$. Then $\nvdash_\Gamma \mathcal{G}$.

The assumption $C \le \Gamma$ of II.6.32 is not used here.


Proof. Assume that

$\vdash_\Gamma \mathcal{G}$  (2)

By the assumption on $\Gamma(x)$, DC 1 is applicable; hence

$\vdash_C \Theta_\Gamma(\ulcorner\mathcal{G}\urcorner)$

(1) of II.6.40 now yields

$\vdash_C \neg\mathcal{G}$

Since PA $\le C$ conservatively, and $\mathcal{G}$ is over $L_N$, we have $\vdash_{PA} \neg\mathcal{G}$, and hence,
by PA $\le \Gamma$,

$\vdash_\Gamma \neg\mathcal{G}$

The above and (2) contradict the consistency of $\Gamma$.

† The other half, which we do not need towards the proof of the second incompleteness theorem,
uses a stronger consistency assumption, $\omega$-consistency (p. 189), and states that $\mathcal{G}$ is not
refutable either.
We continue honouring the assumption of II.6.32; however, we make it explicit
here. That is, we have the following situation:

(1) PA $\le \Gamma$ consistently ($\Gamma$ in $\Sigma_1(L_{NC})$).
(2) $\Gamma \le \Gamma'$ conservatively, by adding the symbols (and their definitions) of $C$.
$\Gamma'$ is in $\Sigma_1(L_{NC})$.


We let the metasymbol “$\mathrm{Con}_\Gamma$” stand for “$\Gamma$ is consistent”. With the help of the
arithmetization tools this can be implemented as a sentence whose semantics
is that some convenient refutable formula is not a $\Gamma'$-theorem.

“$0 = S0$” is our choice of a refutable formula (cf. S1). We may now define

$\mathrm{Con}_\Gamma$ abbreviates $\neg\Theta_{\Gamma'}(\ulcorner 0 = S0\urcorner)$

since $0 = S0$ is unprovable in $\Gamma'$ iff it is so in $\Gamma$; therefore $\neg\Theta_{\Gamma'}(\ulcorner 0 = S0\urcorner)$
(that is, $\mathrm{Con}_\Gamma$) says both “$\Gamma'$ is consistent” and “$\Gamma$ is consistent”.

We will next want to show that, under some reasonable assumptions, $\Gamma$
cannot prove $\mathrm{Con}_\Gamma$, that is, it cannot prove its own consistency.

For this task to be meaningful (and “fair” to $\Gamma$), $\mathrm{Con}_\Gamma$ (that is, the actual
formula it stands for) must be in the language of $\Gamma$, $L_N$. We can guarantee
this if we define $\mathrm{Con}_\Gamma$ to instead abbreviate the sentence $\mathcal{S}$ obtained from
$\neg\Theta_{\Gamma'}(\ulcorner 0 = S0\urcorner)$ by elimination of defined symbols. Thus, we finalize the
definition as

$\mathrm{Con}_\Gamma$ abbreviates $\mathcal{S}$  (*)

where $\mathcal{S}$ is over the basic language $L_N$ of PA and satisfies

$\vdash_C \mathcal{S} \leftrightarrow \neg\Theta_{\Gamma'}(\ulcorner 0 = S0\urcorner)$  (**)


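II.6.42 Theorem (Gödel's Second Incompleteness Theorem). Let $\Gamma$ be a
consistent extension of PA such that $\Gamma(x)$ is $\Sigma_1(L_{NC})$. Then $\nvdash_\Gamma \mathrm{Con}_\Gamma$.

Proof. Let $\Gamma \le \Gamma'$ as in the discussion above (we continue being explicit here
about our blanket assumption II.6.32; thus $C \le \Gamma'$, while $C \le \Gamma$ might fail).

The proof utilizes the fixed point $\mathcal{G}$ of $\neg\Theta_{\Gamma'}$ that is obtained in the manner
of II.6.40-II.6.41 above (note however the prime). Our aim is to show that
(cf. the informal discussion at the beginning of this chapter, p. 205)

$\vdash_{\Gamma'} \mathrm{Con}_\Gamma \to \mathcal{G}$  (3)

We start with the observation

$\models_{taut} \neg\mathcal{G} \to (\mathcal{G} \to 0 = S0)$

Hence (absolutely)

$\vdash \neg\mathcal{G} \to (\mathcal{G} \to 0 = S0)$

which by (1) of II.6.40 (but using the theory $\Gamma'$ rather than $\Gamma$) and the Leibniz
rule implies

$\vdash_C \Theta_{\Gamma'}(\ulcorner\mathcal{G}\urcorner) \to (\mathcal{G} \to 0 = S0)$

DC 1 now yields (this uses $C \le \Gamma'$)

$\vdash_C \Theta_{\Gamma'}(\ulcorner\Theta_{\Gamma'}(\ulcorner\mathcal{G}\urcorner) \to (\mathcal{G} \to 0 = S0)\urcorner)$

which further yields, via DC 2,

$\vdash_C \Theta_{\Gamma'}(\ulcorner\Theta_{\Gamma'}(\ulcorner\mathcal{G}\urcorner)\urcorner) \to (\Theta_{\Gamma'}(\ulcorner\mathcal{G}\urcorner) \to \Theta_{\Gamma'}(\ulcorner 0 = S0\urcorner))$

Löb's DC 3 (II.6.37, which also uses $C \le \Gamma'$) and the above yield, by
tautological implication,

$\vdash_C \Theta_{\Gamma'}(\ulcorner\mathcal{G}\urcorner) \to \Theta_{\Gamma'}(\ulcorner 0 = S0\urcorner)$

By contraposition, and using (1) of II.6.40 (for $\Gamma'$) and (**),

$\vdash_C \mathcal{S} \to \mathcal{G}$  (4)

Since $C \le \Gamma'$, (4) implies (3) (via (*)).

Now if $\vdash_\Gamma \mathrm{Con}_\Gamma$, then also $\vdash_{\Gamma'} \mathrm{Con}_\Gamma$ by $\Gamma \le \Gamma'$. Thus (3) would yield
$\vdash_{\Gamma'} \mathcal{G}$, contradicting II.6.41 (for $\Gamma'$).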


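(1) The above proof is clearly also valid for the trivial extension $\Gamma$ = PA. In
this case, $\Gamma' = C$. Moreover, we note that, since PA $\le C$ conservatively
and $\mathcal{S} \to \mathcal{G}$ is over $L_N$, (4) implies

$\vdash_{PA} \mathcal{S} \to \mathcal{G}$

as well, from which (via PA $\le \Gamma$, no prime) $\vdash_\Gamma \mathcal{S} \to \mathcal{G}$.

(2) It is also true that $\vdash_\Gamma \mathcal{G} \to \mathrm{Con}_\Gamma$; thus $\vdash_\Gamma \mathcal{G} \leftrightarrow \mathrm{Con}_\Gamma$ (Exercise II.25).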


II.6.41 and II.6.42 provide two examples of sentences unprovable by $\Gamma$, a
consistent extension of PA with a $\Sigma_1$ set of axioms. As remarked above, the
two sentences are provably equivalent (in $\Gamma$). The former says “I am not a
theorem”,† and thus its unprovability shows that it is true in $\mathfrak{N}$, the natural
structure appropriate for $L_N$.

† Of $\Gamma'$, being a fixed point of $\neg\Theta_{\Gamma'}$, and hence not of $\Gamma$ either.

Let us put II.6.32 back into (implicit) force to avoid verbosity in the discussion
and results that follow. So we have $C \le \Gamma$.


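What about a sentence that says “I am a theorem of $\Gamma$”?† Such a sentence,
$\mathcal{H}$, is a fixed point of $\Theta_\Gamma$, since $\Theta_\Gamma(\ulcorner\mathcal{H}\urcorner)$ “says” that (i.e., is true in $\mathfrak{N}$ iff)
$\Gamma \vdash \mathcal{H}$. A theorem of Löb (1955) shows that such sentences are provable,
and hence true as well, since then $\Theta_\Gamma(\ulcorner\mathcal{H}\urcorner)$ is true, and we also know that
$\vdash_\Gamma \mathcal{H} \leftrightarrow \Theta_\Gamma(\ulcorner\mathcal{H}\urcorner)$.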


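II.6.43 Theorem (Löb's Theorem (1955)). Let PA $\le \Gamma$ consistently, and
assume that the formula $\Gamma$ is $\Sigma_1(L_{NC})$. Then, for any sentence $\mathcal{A}$,

$\vdash_\Gamma \Theta_\Gamma(\ulcorner\mathcal{A}\urcorner) \to \mathcal{A}$ implies that $\vdash_\Gamma \mathcal{A}$.

Proof. Equivalently, let us prove that

$\nvdash_\Gamma \mathcal{A}$ implies that $\nvdash_\Gamma \Theta_\Gamma(\ulcorner\mathcal{A}\urcorner) \to \mathcal{A}$

So assume that $\nvdash_\Gamma \mathcal{A}$. By I.4.21, $\Gamma + \neg\mathcal{A}$ is a consistent $\Sigma_1(L_{NC})$ extension
of $C$ (hence of PA), that is, it is semantically defined by a $\Sigma_1(L_{NC})$ formula.

Pause. Why is $\Gamma + \neg\mathcal{A}$ $\Sigma_1(L_{NC})$?

By II.6.42,

$\nvdash_{\Gamma+\neg\mathcal{A}} \mathrm{Con}_{\Gamma+\neg\mathcal{A}}$  (1)

Now, we choose as an “implementation” of the sentence “$\mathrm{Con}_{\Gamma+\neg\mathcal{A}}$” the
sentence “$\neg\Theta_\Gamma(\ulcorner\mathcal{A}\urcorner)$”, because of I.4.21. Thus, (1) can be written as

$\nvdash_{\Gamma+\neg\mathcal{A}} \neg\Theta_\Gamma(\ulcorner\mathcal{A}\urcorner)$

Hence, by modus ponens,

$\nvdash_\Gamma \neg\mathcal{A} \to \neg\Theta_\Gamma(\ulcorner\mathcal{A}\urcorner)$

The contrapositive is what we want: $\nvdash_\Gamma \Theta_\Gamma(\ulcorner\mathcal{A}\urcorner) \to \mathcal{A}$.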
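Two facts carry this proof, and the hedged sketch below records one plausible reading of each. For the Pause, the axioms of $\Gamma + \neg\mathcal{A}$ can be defined by the disjunction of $\Gamma(x)$ with the primitive recursive condition $x = \ulcorner\neg\mathcal{A}\urcorner$, which is $\Sigma_1(L_{NC})$ by Exercise II.24. For the choice of implementation, the second line assumes that I.4.21 is the usual statement that $\Gamma + \neg\mathcal{A}$ is inconsistent iff $\Gamma \vdash \mathcal{A}$; the formalized version of that equivalence inside $C$ is not derived here.

```latex
\begin{align*}
&(\Gamma + \neg\mathcal{A})(x) \;\leftrightarrow\; \Gamma(x) \lor x = \ulcorner\neg\mathcal{A}\urcorner
  && \Sigma_1(L_{NC}) \text{ by Exercise II.24}\\
&\Gamma + \neg\mathcal{A} \vdash 0 = S0 \;\iff\; \Gamma \vdash \mathcal{A}
  && \text{I.4.21 (assumed form)}\\
&\neg\Theta_{\Gamma+\neg\mathcal{A}}(\ulcorner 0 = S0\urcorner)
  \;\text{ is implemented as }\; \neg\Theta_\Gamma(\ulcorner\mathcal{A}\urcorner)
  && \text{reading of } \mathrm{Con}_{\Gamma+\neg\mathcal{A}}
\end{align*}
```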


The statement in II.6.43 is actually an equivalence. The other direction is
trivially true by tautological implication.

† This question was posed by Henkin (1952).

Löb's theorem can be proved from first principles, i.e., without reliance on
Gödel's second incompleteness theorem. One starts by getting a fixed point $\mathcal{B}$
of $\Theta_\Gamma(x) \to \mathcal{A}$ in $C$ and then judiciously applying the derivability conditions
(Exercise II.26).
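As a hedged sketch toward that exercise, the standard textbook argument runs as follows. Whether this is exactly the route the author intends is an assumption. The blanket hypothesis $C \le \Gamma$ is used to move $C$-theorems into $\Gamma$, and DC 1 is read as: $\vdash_\Gamma \mathcal{X}$ implies $\vdash_C \Theta_\Gamma(\ulcorner\mathcal{X}\urcorner)$.

```latex
\begin{align*}
&\text{Hypothesis: } \vdash_\Gamma \Theta_\Gamma(\ulcorner\mathcal{A}\urcorner) \to \mathcal{A}\\
&(a)\ \vdash_C \mathcal{B} \leftrightarrow (\Theta_\Gamma(\ulcorner\mathcal{B}\urcorner) \to \mathcal{A})
  && \text{fixpoint theorem II.6.39}\\
&(b)\ \vdash_C \Theta_\Gamma(\ulcorner\mathcal{B} \to (\Theta_\Gamma(\ulcorner\mathcal{B}\urcorner) \to \mathcal{A})\urcorner)
  && \text{DC 1 on the } \to \text{ half of (a)}\\
&(c)\ \vdash_C \Theta_\Gamma(\ulcorner\mathcal{B}\urcorner) \to
  (\Theta_\Gamma(\ulcorner\Theta_\Gamma(\ulcorner\mathcal{B}\urcorner)\urcorner) \to \Theta_\Gamma(\ulcorner\mathcal{A}\urcorner))
  && \text{DC 2 twice, from (b)}\\
&(d)\ \vdash_C \Theta_\Gamma(\ulcorner\mathcal{B}\urcorner) \to \Theta_\Gamma(\ulcorner\mathcal{A}\urcorner)
  && \text{(c) and DC 3}\\
&(e)\ \vdash_\Gamma \Theta_\Gamma(\ulcorner\mathcal{B}\urcorner) \to \mathcal{A}
  && \text{(d), the hypothesis, } C \le \Gamma\\
&(f)\ \vdash_\Gamma \mathcal{B}
  && \text{(e) and the } \leftarrow \text{ half of (a)}\\
&(g)\ \vdash_C \Theta_\Gamma(\ulcorner\mathcal{B}\urcorner)
  && \text{DC 1 on (f)}\\
&(h)\ \vdash_\Gamma \mathcal{A}
  && \text{(e), (g), modus ponens}
\end{align*}
```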


Conversely, Löb's theorem implies Gödel's (second) theorem: Indeed, the
tautology

$\neg\Theta_\Gamma(\ulcorner 0 = S0\urcorner) \to (\Theta_\Gamma(\ulcorner 0 = S0\urcorner) \to 0 = S0)$

yields†

$\vdash_\Gamma \neg\Theta_\Gamma(\ulcorner 0 = S0\urcorner) \to (\Theta_\Gamma(\ulcorner 0 = S0\urcorner) \to 0 = S0)$

Thus

$\vdash_\Gamma \mathrm{Con}_\Gamma \to (\Theta_\Gamma(\ulcorner 0 = S0\urcorner) \to 0 = S0)$

Suppose now that $\vdash_\Gamma \mathrm{Con}_\Gamma$. Then, by the above,

$\vdash_\Gamma \Theta_\Gamma(\ulcorner 0 = S0\urcorner) \to 0 = S0$

Hence, by Löb's theorem, $\vdash_\Gamma 0 = S0$, contradicting the consistency of $\Gamma$.

We have simplified the hypotheses in Gödel's two theorems in allowing $\Gamma$ to
extend PA (or, essentially equivalently, extend $C$ in the proof of the second). The
theorems actually hold under a more general hypothesis that $\Gamma$ contains PA (or
$C$). This containment can be understood intuitively, at least in the case of formal
set theory (formalized as ZFC for example):‡ There is enough machinery inside
ZFC to construct the set of natural numbers ($\omega$) and show that this set satisfies
the axioms PA.§ One can then carry out the arithmetization of Section II.5 for
terms, formulas, proofs, and theorems, of ZFC this time, in ZFC exactly as we
did for Peano arithmetic, and prove that both incompleteness theorems hold for
ZFC.¶

† Under the assumption of II.6.32, $C \le \Gamma$, $\Theta_\Gamma$ is in the language of $\Gamma$.

‡ The formal treatment requires the concept of interpreting one theory inside another. This is one
of many interesting topics that we must leave out in this brief course in logic. See however our
volume 2 for a complete discussion of this topic and its bearing on this comment here.

§ Thus, one constructs a model of PA inside ZFC. We show how this is done in volume 2.

¶ In other words, the elements of the set $\omega$ will furnish us with Gödel numbers for the formulas and
terms of ZFC, while certain terms and formulas over $\omega$ will allow ZFC to “argue” about these
Gödel numbers.
II.7. Exercises

II.1. Prove that PA can actually prove axiom <3, and therefore this axiom is
dependent (redundant).

II.2. Prove in PA: $\vdash x < y \to \neg\, y < x$.

II.3. Prove in PA: $\vdash x + y = y + x$.

II.4. Prove in PA: $\vdash x + (y + z) = (x + y) + z$.

II.5. Prove in PA: $\vdash x \times y = y \times x$.

II.6. Prove in PA: $\vdash x \times (y \times z) = (x \times y) \times z$.

II.7. Prove in PA: $\vdash x \times (y + z) = (x \times y) + (x \times z)$.

II.8. Settle the Pause regarding the displayed formula (f) on p. 218.

II.9. Prove the formula (LCM′) on p. 230.

II.10. Prove in PA: $x > 0 \to x = S\delta(x, S0)$.

II.11. Prove Lemma II.1.36.

II.12. Prove in PA: $\vdash x \le J(x,y)$, and $\vdash y \le J(x,y)$, where $J$ is that of p. 234.

II.13. Prove the concluding claim in II.3.7.

II.14. Conclude the proof of Lemma II.4.6.

II.15. Prove that there is a unary formal primitive recursive predicate $\Gamma$ such
that, for any formula $\mathcal{A}$, $\Gamma(\ulcorner\mathcal{A}\urcorner)$ means that $\mathcal{A}$ is (an instance of) a
Peano axiom.

II.16. Prove that if the formula $\Gamma$ that semantically defines the nonlogical
axioms of an extension of PA is $\Sigma_1(L_{NC})$ then so are Proof, Deriv,
and $\Theta$.

II.17. Settle the Pause on p. 273.

II.18. Settle the Pause on p. 289.

II.19. Prove that

$\vdash Free(i,x) > 0 \land (Term(x) = 0 \lor WFF(x) = 0) \to Sub(x,i,z) = x$

II.20. Prove that

$\vdash Deriv(y,x) \to Deriv(y * \langle x\rangle, x)$

II.21. Show that it is not true that for arbitrary unary $f$

$\vdash \{fx\} = Num(fx)$

II.22. Prove that if $\mathcal{Q}$ is $\Delta_0(L_{NC})$ and $P$ is introduced by $P\vec{x} \leftrightarrow \mathcal{Q}(\vec{x})$, then
$P$ is primitive recursive.

II.23. Prove that the characterizing formula for the nonlogical axioms of the
theory $C$ (that is, $\Gamma$ such that, for any formula $\mathcal{A}$, $\Gamma(\ulcorner\mathcal{A}\urcorner)$ means that
$\mathcal{A}$ is a nonlogical axiom) is primitive recursive.

II.24. Prove that if the one-variable formulas $Q$ and $\Gamma$ are primitive recursive
and $\Sigma_1(L_{NC})$ respectively, then the formula $Q(x) \lor \Gamma(x)$ is $\Sigma_1(L_{NC})$.

II.25. Refer to II.6.42. Prove that it is also true that $\vdash_\Gamma \mathcal{G} \to \mathrm{Con}_\Gamma$, and thus

$\vdash_\Gamma \mathcal{G} \leftrightarrow \mathrm{Con}_\Gamma$

II.26. Prove Löb's theorem without the help of Gödel's second incompleteness
theorem.

II.27. Prove Tarski's theorem (see I.9.31, p. 174) on the (semantic) undefinability
of truth using the fixpoint theorem (II.6.39) to find a sentence that
says “I am false”.

II.28. Refer to II.6.41. Let $\Gamma$ over $L_N$ be an $\omega$-consistent extension of PA such
that $\Gamma(x)$ is $\Sigma_1(L_{NC})$. Let the sentence $\mathcal{G}$ over $L_N$ be a fixed point of
$\neg\Theta_\Gamma$ in $C$. Then $\nvdash_\Gamma \neg\mathcal{G}$.

II.29. Let $\mathcal{A}$ be any sentence in the language of a $\Gamma$ (where $\Gamma(x)$ is $\Sigma_1(L_{NC})$)
that extends PA consistently. Establish that $\vdash_\Gamma \neg\Theta_\Gamma(\ulcorner\mathcal{A}\urcorner) \to \mathrm{Con}_\Gamma$.

II.30. With the usual assumptions on $\Gamma$ and $C$ (II.6.32), prove that

$\vdash_C \mathrm{Con}_\Gamma \to \mathrm{Con}_{\Gamma + \neg\mathrm{Con}_\Gamma}$